Abstract. We develop and study the complexity of propositional proof systems of varying strength 
extending resolution by allowing it to operate with disjunctions of linear equations instead of 
clauses. We demonstrate polynomial-size refutations for hard tautologies like the pigeonhole prin- 
ciple, Tseitin graph tautologies and the clique-coloring tautologies in these proof systems. Using 
the (monotone) interpolation by a communication game technique we establish an exponential-size 
lower bound on refutations in a certain, considerably strong, fragment of resolution over linear 
equations, as well as a general polynomial upper bound on (non-monotone) interpolants in this 
fragment. 

We then apply these results to extend and improve previous results on multilinear proofs (over 
fields of characteristic 0), as studied in [RT06]. Specifically, we show the following: 

• Proofs operating with depth-3 multilinear formulas polynomially simulate a certain, consid- 
erably strong, fragment of resolution over linear equations. 

• Proofs operating with depth-3 multilinear formulas admit polynomial-size refutations of the 
pigeonhole principle and Tseitin graph tautologies. The former improve over a previous result 
that established small multilinear proofs only for the functional pigeonhole principle. The 
latter are different than previous proofs, and apply to multilinear proofs of Tseitin mod p 
graph tautologies over any field of characteristic 0. 

We conclude by connecting resolution over linear equations with extensions of the cutting planes 
proof system. 
1. Introduction 



This paper considers two kinds of proof systems. The first kind are extensions of resolution 
that operate with disjunctions of linear equations with integral coefficients instead of clauses. The 
second kind are algebraic proof systems operating with multilinear arithmetic formulas. Proofs in 
both kinds of systems establish the unsatisfiability of formulas in conjunctive normal form (CNF). 
We are primarily concerned with connections between these two families of proof systems and with 
extending and improving previous results on multilinear proofs. 

The resolution system is a popular propositional proof system that establishes the unsatisfiability 
of CNF formulas (or equivalently, the truth of tautologies in disjunctive normal form) by operating 
with clauses (a clause is a disjunction of propositional variables and their negations). It is well 
known that resolution cannot provide small (that is, polynomial-size) proofs for many basic count- 
ing arguments. The most notable example of this are the strong exponential lower bounds on the 
resolution refutation size of the pigeonhole principle and its different variants (Haken [Hak85] was 
the first to establish such a lower bound; see also [Razb02] for a survey on the proof complexity of 
the pigeonhole principle). Due to the popularity of resolution both in practice, as the core of many 
automated theorem provers, and as a theoretical case-study in propositional proof complexity, it 
is natural to consider weak extensions of resolution that can overcome its inefficiency in provid- 
ing proofs of counting arguments. The proof systems we present in this paper are extensions of 
resolution, of various strength, that are suited for this purpose. 

Propositional proof systems of a different nature that also attracted much attention in proof 
complexity theory are algebraic proof systems, which are proof systems operating with (multivariate) 
polynomials over a field. In this paper, we are particularly interested in algebraic proof systems 
that operate with multilinear polynomials represented as multilinear arithmetic formulas, called by 
the generic name multilinear proofs (a polynomial is multilinear if the power of each variable in its 
monomials is at most one) . The investigation into such proof systems was initiated in [RT06] , and 
here we continue this line of research. This research is motivated on the one hand by the apparent 
considerable strength of such systems; and on the other hand, by the known super-polynomial 
size lower bounds on multilinear formulas computing certain important functions [Raz04, Raz06], 
combined with the general working assumption that establishing lower bounds on the size of objects 
a proof system manipulates (in this case, multilinear formulas) is close to establishing lower bounds 
on the size of the proofs themselves. 
The basic proof system we shall study is denoted R(lin). The proof-lines in R(lin) proofs are 
disjunctions of linear equations with integral coefficients over the variables $x = x_1, \ldots, x_n$. It 
turns out that (already proper subsystems of) R(lin) can handle basic counting arguments very 
elegantly. The following defines the R(lin) proof system. Given an initial CNF, we translate 
every clause $\bigvee_{i \in I} x_i \vee \bigvee_{j \in J} \neg x_j$ (where I are the indices of variables with positive polarities and 
J are the indices of variables with negative polarities) pertaining to the CNF, into the disjunction 
$\bigvee_{i \in I} (x_i = 1) \vee \bigvee_{j \in J} (x_j = 0)$. Let A and B be two disjunctions of linear equations, and let $a \cdot x = a_0$ 
and $b \cdot x = b_0$ be two linear equations (where a, b are two vectors of n integral coefficients, and 
$a \cdot x$ is the scalar product $\sum_{i=1}^n a_i x_i$, and similarly for $b \cdot x$). The rules of inference belonging to 
R(lin) allow one to derive $A \vee B \vee ((a + b) \cdot x = a_0 + b_0)$ from $A \vee (a \cdot x = a_0)$ and $B \vee (b \cdot x = b_0)$ (or 
similarly, to derive $A \vee B \vee ((a - b) \cdot x = a_0 - b_0)$ from $A \vee (a \cdot x = a_0)$ and $B \vee (b \cdot x = b_0)$). We can 
also simplify disjunctions by discarding (unsatisfiable) equations of the form $(0 = k)$, for $k \neq 0$. 
In addition, for every variable $x_i$, we shall add an axiom $(x_i = 0) \vee (x_i = 1)$, which forces $x_i$ to 
take on only Boolean values. A derivation of the empty disjunction (which stands for false) from 
the (translated) clauses of a CNF is called an R(lin) refutation of the given CNF. This way, every 
unsatisfiable CNF has an R(lin) refutation (this can be proved by a straightforward simulation of 
resolution by R(lin)). 

The basic idea connecting resolution operating with disjunctions of linear equations and multilin- 
ear proofs is this: Whenever a disjunction of linear equations is simple enough — and specifically, 
when it is close to a symmetric function, in a manner made precise — then it can be represented 
by a small size and small depth multilinear arithmetic formula over fields of characteristic 0. This 
idea was already used (somewhat implicitly) in [RT06] to obtain polynomial-size multilinear proofs 
operating with depth-3 multilinear formulas of the functional pigeonhole principle (this principle 
is weaker than the pigeonhole principle). In the current paper we generalize previous results on 
multilinear proofs by fully using this idea: We show how to polynomially simulate with multilinear 
proofs, operating with small depth multilinear formulas, certain short proofs carried inside resolu- 
tion over linear equations. This enables us to provide new polynomial-size multilinear proofs for 
certain hard tautologies, improving results from [RT06]. 

More specifically, we introduce a certain fragment of R(lin), which can be polynomially simu- 
lated by depth-3 multilinear proofs (that is, multilinear proofs operating with depth-3 multilinear 
formulas). On the one hand this fragment of resolution over linear equations already is sufficient 
to formalize in a transparent way basic counting arguments, and so it admits small proofs of the 
pigeonhole principle and the Tseitin mod p formulas (which yields some new upper bounds on 
multilinear proofs); and on the other hand we can use the (monotone) interpolation technique to 
establish an exponential-size lower bound on refutations in this fragment as well as demonstrating a 
general (non-monotone) polynomial upper bound on interpolants for this fragment. The possibility 
that multilinear proofs (possibly, operating with depth-3 multilinear formulas) possess the feasible 
monotone interpolation property (and hence, admit exponential-size lower bounds) remains open. 

Another family of propositional proof systems we discuss in relation to the systems mentioned 
above are the cutting planes system and its extensions. The cutting planes proof system operates 
with linear inequalities with integral coefficients, and this system is very close to the extensions 
of resolution we present in this paper. In particular, the following simple observation can be used 
to polynomially simulate cutting planes proofs with polynomially bounded coefficients (and some 
of its extensions) inside resolution over linear equations: The truth value of a linear inequality 
$a \cdot x \geq a_0$ (where a is a vector of n integral coefficients and x is a vector of n Boolean variables) is 
equivalent to the truth value of the following disjunction of linear equalities: 
$(a \cdot x = a_0) \vee (a \cdot x = a_0 + 1) \vee \cdots \vee (a \cdot x = a_0 + k)$, 
where $a_0 + k$ equals the sum of all positive coefficients in a (that is, $a_0 + k = \max_{x \in \{0,1\}^n} (a \cdot x)$). 

Footnote: Each element (usually a formula) of a proof-sequence is referred to as a proof-line. 
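
The observation that an inequality over Boolean variables has the same truth value as a disjunction of equalities can be checked by brute force on small instances. The following is a minimal sketch, not taken from the paper: the function names and the example coefficient vectors are hypothetical, and the check enumerates all 0/1 assignments, so it is only meant for small n.

```python
from itertools import product

def inequality_as_disjunction(a, a0):
    """Free-terms a0, a0+1, ..., max(a.x) of the disjunction for a.x >= a0."""
    # The maximum of a.x over 0/1 assignments is the sum of positive coefficients.
    max_value = sum(c for c in a if c > 0)
    return list(range(a0, max_value + 1))

def check_equivalence(a, a0):
    """Compare a.x >= a0 with the disjunction on every 0/1 assignment."""
    free_terms = inequality_as_disjunction(a, a0)
    for x in product((0, 1), repeat=len(a)):
        value = sum(c * v for c, v in zip(a, x))
        if (value >= a0) != (value in free_terms):
            return False
    return True

# Illustrative instance: 2*x1 - x2 + 3*x3 >= 1.
terms = inequality_as_disjunction([2, -1, 3], 1)
ok = check_equivalence([2, -1, 3], 1)
```

Note that the number of equalities in the disjunction grows with the magnitude of the coefficients, which is consistent with the restriction to polynomially bounded coefficients in the simulation of cutting planes.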

Note on terminology. All the proof systems considered in this paper intend to prove the unsatis- 
fiability over 0, 1 values of collections of clauses (possibly, of translation of the clauses to disjunctions 
of linear equations). In other words, proofs in such proof systems intend to refute the collections 
of clauses, which is to validate their negation. Therefore, throughout this paper we shall sometimes 
speak about refutations and proofs interchangeably, always intending refutations, unless otherwise 
stated. 

1.1. Comparison to Earlier Work. To the best of our knowledge this paper is the first that 
considers resolution proofs operating with disjunctions of linear equations. Previous works consid- 
ered extensions of resolution over linear inequalities augmented with the cutting planes inference 
rules (the resulting proof system denoted R(CP)). In full generality, we show that resolution over 
linear equations can polynomially simulate R(CP) when the coefficients in all the inequalities are 
polynomially bounded (however, the converse is not known to hold). On the other hand, we shall 
consider a certain fragment of resolution over linear equations, in which we do not even know how to 
polynomially simulate cutting planes proofs with polynomially bounded coefficients in inequalities 
(let alone R(CP) with polynomially bounded coefficients in inequalities). We now briefly discuss 
the previous work on R(CP) and related proof systems. 

Extensions of resolution to disjunctions of linear inequalities were first considered by Krajíček 
[Kra98] who developed the proof systems LK(CP) and R(CP). The LK(CP) system is a first-order 
(Gentzen-style) sequent calculus that operates with linear inequalities instead of atomic formulas 
and augments the standard first-order sequent calculus inference rules with the cutting planes 
inference rules. The R(CP) proof system is essentially resolution over linear inequalities, that is, 
resolution that operates with disjunctions of linear inequalities instead of clauses. 

The main motivation of [Kra98] is to extend the feasible interpolation technique and consequently 
the lower bounds results, from cutting planes and resolution to stronger proof systems. That paper 
establishes an exponential-size lower bound on a restricted version of R(CP) proofs, namely, when 
the number of inequalities in each proof-line is $O(n^{\epsilon})$, where n is the number of variables of the 
initial formulas, $\epsilon$ is a small enough constant and the coefficients in the cutting planes inequalities 
are polynomially bounded. 

Other papers considering extensions of resolution over linear inequalities are the more recent 
papers by Hirsch & Kojevnikov [HK06] and Kojevnikov [Koj07]. The first paper [HK06] considers 
a combination of resolution with LP (an incomplete subsystem of cutting planes based on simple 
linear programming reasoning), with the 'lift and project' proof system (L&P), and with the cutting 
planes proof system. The second paper [Koj07] deals with improving the parameters of the tree-like 
R(CP) lower-bounds obtained in [Kra98]. 

Whereas previous results were concerned primarily with extending the cutting planes proof system, 
our foremost motivation is to extend and improve previous results on algebraic proof systems 
operating with multilinear formulas obtained in [RT06]. In that paper the concept of multilinear 
proofs was introduced and several basic results concerning multilinear proofs were proved. In 
particular, polynomial-size proofs of two important combinatorial principles were demonstrated: 
the functional pigeonhole principle and the Tseitin (mod p) graph tautologies. In the current paper 
we improve both these results. 

As mentioned above, motivated by relations with multilinear proofs operating with depth-3 mul- 
tilinear formulas, we shall consider a certain subsystem of resolution over linear equations. For 
this subsystem we apply twice the interpolation by a communication game technique. The first 
application is of the non-monotone version of the technique, and the second application is of the 
monotone version. Namely, the first application provides a general (non-monotone) interpolation 
theorem that demonstrates a polynomial (in the size of refutations) upper bound on interpolants. 
The proof uses the general method of transforming a refutation into a Karchmer-Wigderson 
communication game for two players, from which a Boolean circuit is then attainable. In particular, 
we shall apply the interpolation theorem of Krajíček from [Kra97]. The second application of the 
(monotone) interpolation by a communication game technique is implicit and proceeds by using 
the lower bound criterion of Bonet, Pitassi & Raz in [BPR97]. This criterion states that (semantic) 
proof systems (of a certain natural and standard kind) whose proof-lines (considered as Boolean 
functions) have low communication complexity cannot prove efficiently a certain tautology (namely, 
the clique-coloring tautologies). 

1.2. Summary of Results. This paper introduces and connects several new concepts and ideas 
with some known ones. It identifies new extensions of resolution operating with linear equations, 
and relates (a certain) such extension to multilinear proofs. The upper bounds for the pigeonhole 
principle and Tseitin mod p formulas in fragments of resolution over linear equations are new. By 
generalizing the machinery developed in [RT06], these upper bounds yield new and improved re- 
sults concerning multilinear proofs. The lower bound for the clique-coloring formulas in a fragment 
of resolution over linear equations employs the standard monotone interpolation by a communica- 
tion game technique, and specifically utilizes the theorem of Bonet, Pitassi & Raz from [BPR97]. 
The general (non-monotone) interpolation result for a fragment of resolution over linear equations 
employs the theorem of Krajíček from [Kra97]. The upper bound for the clique-coloring formulas 
in R(lin) (the stronger variant of resolution over linear equations, as described in the introduction) 
follows that of Atserias, Bonet & Esteban [ABE02]. We now give a detailed outline of the results 
in this paper. 

The proof systems. In Section 3 we formally define two extensions of resolution of decreasing 
strength allowing resolution to operate with disjunctions of linear equations. The size of a linear 
equation $a_1 x_1 + \ldots + a_n x_n = a_0$ is the sum of all $a_0, \ldots, a_n$ written in unary notation. The size of 
a disjunction of linear equations is the total size of all linear equations in the disjunction. The size 
of a proof operating with disjunctions of linear equations is the total size of all the disjunctions in 
it. 

R(lin): This is the stronger proof system (described in the introduction) that operates with 
disjunctions of linear equations with integer coefficients. 

R°(lin): This is a (provably proper) fragment of R(lin). It operates with disjunctions of (arbi- 
trarily many) linear equations whose variables have constant coefficients, under the restriction that 
every disjunction can be partitioned into a constant number of sub-disjunctions, where each sub- 
disjunction either consists of linear equations that differ only in their free-terms or is a (translation 
of a) clause. 

Note that any single linear inequality with Boolean variables can be represented by a disjunction 
of linear equations that differ only in their free-terms (see the example in the introduction section). 
So the R°(lin) proof system is close to a proof system operating with disjunctions of constant 
number of linear inequalities (with constant integral coefficients). In fact, disjunctions of linear 
equations varying only in their free-terms, have more (expressive) strength than a single inequality. 
For instance, the parity function can be easily represented by a disjunction of linear equations, 
while it cannot be represented by a single linear inequality (or even by a disjunction of linear 
inequalities). 
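
The contrast between a disjunction of equations that differ only in their free-terms and a single inequality can be illustrated on small cases. The sketch below is an illustration under stated assumptions, not the paper's argument: the function names are hypothetical, parity is represented by $x_1 + \ldots + x_n = k$ for all odd k, and the search over inequalities only covers two variables with coefficients in a small bounded range, so it illustrates the claim rather than proving it.

```python
from itertools import product

def parity_disjunction(n):
    """Linear form x1 + ... + xn together with the odd free-terms 1, 3, 5, ..."""
    return [1] * n, set(range(1, n + 1, 2))

def represents_parity(n):
    """Check that the disjunction holds exactly on assignments of odd weight."""
    a, free_terms = parity_disjunction(n)
    return all(
        (sum(c * v for c, v in zip(a, x)) in free_terms) == (sum(x) % 2 == 1)
        for x in product((0, 1), repeat=n)
    )

def some_inequality_is_xor(bound=3):
    """Search a1*x1 + a2*x2 >= a0 with |a_i| <= bound for one that equals XOR."""
    coefficients = range(-bound, bound + 1)
    for a1, a2, a0 in product(coefficients, repeat=3):
        if all((a1 * x1 + a2 * x2 >= a0) == (x1 != x2)
               for x1, x2 in product((0, 1), repeat=2)):
            return True
    return False
```

For two variables the bounded search finds no inequality, in line with the fact that XOR is not a threshold function.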

As already mentioned, the motivation to consider the restricted proof system R°(lin) comes from 
its relation to multilinear proofs operating with depth-3 multilinear formulas (in short, depth-3 
multilinear proofs): R°(lin) corresponds roughly to the subsystem of R(lin) that we know how 
to simulate by depth-3 multilinear proofs via the technique in [RT06] (the technique is based on 
converting disjunctions of linear forms into symmetric polynomials, which are known to have small 
depth-3 multilinear formulas). This simulation is then applied in order to improve over known 
upper bounds for depth-3 multilinear proofs, as R°(lin) is already sufficient to efficiently prove 
certain "hard tautologies". Moreover, we are able to establish an exponential lower bound on 
R°(lin) refutations size (see below for both upper and lower bounds on R°(lin) proofs). We also 
establish a super-polynomial separation of R(lin) from R°(lin) (via the clique-coloring principle, for 
a certain choice of parameters; see below). 

Short refutations. We demonstrate the following short refutations in R°(lin) and R(lin): 

(1) Polynomial-size refutations of the pigeonhole principle in R°(lin); 

(2) Polynomial-size refutations of Tseitin mod p graph formulas in R°(lin); 

(3) Polynomial-size refutations of the clique-coloring formulas in R(lin) (for certain parameters). 
The refutations here follow by direct simulation of the Res(2) refutations of clique-coloring 
formulas from [ABE02]. 

All the three families of formulas above are prominent "hard tautologies" in proof complexity 
literature, which means that strong size lower bounds on proofs in various proof systems are known 
for them (for the exact formulation of these families of formulas see Section 6). 

Interpolation results. We provide a polynomial upper-bound on (non-monotone) interpolants 
corresponding to R°(lin) refutations. Namely, we show that any R°(lin)-refutation of a given formula 
can be transformed into a (non-monotone) Boolean circuit computing the corresponding interpolant 
function of the formula (if there exists such a function), with at most a polynomial increase in size. 
We employ the general interpolation theorem of Krajíček [Kra97] for semantic proof systems. 

Lower bounds. We provide the following exponential lower bound: 

Theorem 1. R°(lin) does not have sub-exponential refutations for the clique-coloring formulas. 

This result is proved by applying a result of Bonet, Pitassi & Raz [BPR97], who (implicitly) use 
the monotone interpolation by a communication game technique for establishing an 
exponential-size lower bound on refutations of general semantic proof systems operating with 
proof-lines of low communication complexity. 

Applications to multilinear proofs. Multilinear proof systems are (semantic) refutation sys- 
tems operating with multilinear polynomials over a fixed field, where every multilinear polynomial 
is represented by a multilinear arithmetic formula. In this paper we shall consider multilinear 
formulas over fields of characteristic 0 only. The size of a multilinear proof (that is, a proof in 
a multilinear proof system) is the total size of all multilinear formulas in the proof (for formal 
definitions concerning multilinear proofs see Section 9). 

We shall first connect multilinear proofs with resolution over linear equations by the following 
result: 

Theorem 2. Multilinear proofs operating with depth-3 multilinear formulas over characteristic 0 
polynomially simulate R°(lin). 

An immediate corollary of this theorem and the upper bounds in R°(lin) described above are 
polynomial-size multilinear proofs for the pigeonhole principle and the Tseitin mod p formulas. 

(1) Polynomial-size depth-3 multilinear refutations for the pigeonhole principle over fields of 
characteristic 0. This improves over [RT06] that shows a similar upper bound for a weaker 
principle, namely, the functional pigeonhole principle. 

(2) Polynomial-size depth-3 multilinear refutations for the Tseitin mod p graph formulas over 
fields of characteristic 0. These refutations are different from those demonstrated in [RT06], 
and further they establish short multilinear refutations of the Tseitin mod p graph formulas 
over any field of characteristic 0 (the proof in [RT06] showed how to refute the Tseitin mod 
p formulas by multilinear refutations only over fields that contain a primitive pth root of 
unity). 

Relations with cutting planes proofs. As mentioned in the introduction, a proof system combining 
resolution with cutting planes was presented by Krajíček in [Kra98]. The resulting system 
is denoted R(CP) (see Section 10 for a definition). When the coefficients in the linear inequalities 
inside R(CP) proofs are polynomially bounded, the resulting proof system is denoted R(CP*). We 
establish the following simulation result: 

Theorem 3. R(lin) polynomially simulates resolution over cutting planes inequalities with polyno- 
mially bounded coefficients R(CP*). 

We do not know if the converse also holds. 

2. Notation and Background on Propositional Proof Systems 

For a natural number n, we use [n] to denote $\{1, \ldots, n\}$. For a vector of n (integral) coefficients 
a and a vector of n variables x, we denote by $a \cdot x$ the scalar product $\sum_{i=1}^n a_i x_i$. If b is another 
vector (of length n), then a + b denotes the addition of a and b as vectors, and ca (for an integer 
c) denotes the product of the scalar c with a (where $-a$ denotes $-1 \cdot a$). For two linear equations 
$L_1 : a \cdot x = a_0$ and $L_2 : b \cdot x = b_0$, their addition $(a + b) \cdot x = a_0 + b_0$ is denoted $L_1 + L_2$ (and their 
subtraction $(a - b) \cdot x = a_0 - b_0$ is denoted $L_1 - L_2$). For two Boolean assignments (identified as 
0, 1 strings) $\alpha, \alpha' \in \{0,1\}^n$ we write $\alpha' \geq \alpha$ if $\alpha'_i \geq \alpha_i$ for all $i \in [n]$ (where $\alpha_i, \alpha'_i$ are the ith bits 
of $\alpha$ and $\alpha'$, respectively). 

We now recall some basic concepts on propositional proof systems. For background on algebraic 
proof systems (and specifically multilinear proofs) see Section 9. 

Resolution. In order to put our work in context, we need to define the resolution refutation system. 

A CNF formula over the variables $x_1, \ldots, x_n$ is defined as follows. A literal is a variable $x_i$ or 
its negation $\neg x_i$. A clause is a disjunction of literals. A CNF formula is a conjunction of clauses. 
The size of a clause is the number of literals in it. 

Resolution is a complete and sound proof system for unsatisfiable CNF formulas. Let C and D 
be two clauses containing neither $x_i$ nor $\neg x_i$. The resolution rule allows one to derive $C \vee D$ from 
$C \vee x_i$ and $D \vee \neg x_i$. The clause $C \vee D$ is called the resolvent of the clauses $C \vee x_i$ and $D \vee \neg x_i$ on 
the variable $x_i$, and we also say that $C \vee x_i$ and $D \vee \neg x_i$ were resolved over $x_i$. The weakening rule 
allows one to derive the clause $C \vee D$ from the clause C, for any two clauses C, D. 

Definition 2.1 (Resolution). A resolution proof of the clause D from a CNF formula K is a 
sequence of clauses $D_1, D_2, \ldots, D_\ell$, such that: (1) each clause $D_j$ is either a clause of K or a 
resolvent of two previous clauses in the sequence or derived by the weakening rule from a previous 
clause in the sequence; (2) the last clause $D_\ell = D$. The size of a resolution proof is the sum of all 
the sizes of the clauses in it. A resolution refutation of a CNF formula K is a resolution proof of 
the empty clause □ from K (the empty clause stands for false; that is, the empty clause has no 
satisfying assignments). 
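
The resolution rule, the weakening rule and the size measure of Definition 2.1 can be sketched in a few lines. This is an illustrative sketch only: the encoding of a literal as a nonzero integer (i for $x_i$, -i for $\neg x_i$), the function names, and the three-clause example CNF are assumptions made here and do not come from the paper.

```python
def resolve(c1, c2, var):
    """From C v x_var (c1) and D v not x_var (c2) derive the resolvent C v D."""
    if var not in c1 or -var not in c2:
        raise ValueError("c1 must contain x_var and c2 its negation")
    c, d = c1 - {var}, c2 - {-var}
    if {var, -var} & (c | d):
        raise ValueError("C and D must contain neither x_var nor its negation")
    return c | d

def weaken(c, d):
    """From the clause C derive C v D."""
    return c | d

def proof_size(proof):
    """Sum of the sizes (numbers of literals) of all clauses in the proof."""
    return sum(len(clause) for clause in proof)

# Hypothetical unsatisfiable CNF: (x1 v x2), (not x1 v x2), (not x2).
k = [frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})]
d4 = resolve(k[0], k[1], 1)   # the clause x2
d5 = resolve(d4, k[2], 2)     # the empty clause
refutation = k + [d4, d5]
```

Here `refutation` is a proof-sequence ending in the empty clause, so it is a resolution refutation of the example CNF in the sense of the definition.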

A proof in resolution (or any of its extensions) is also called a derivation or a proof-sequence. 
Each sequence-element in a proof-sequence is also called a proof-line. A proof-sequence containing 
the proof-lines $D_1, \ldots, D_\ell$ is also said to be a derivation of $D_1, \ldots, D_\ell$. 
Cook-Reckhow proof systems. Following [CR79], a Cook-Reckhow proof system is a polynomial-time 
algorithm A that receives a Boolean formula F (for instance, a CNF) and a string $\pi$ over some 
finite alphabet ("the (proposed) refutation" of F), such that there exists a $\pi$ with $A(F, \pi) = 1$ if 
and only if F is unsatisfiable. The completeness of a (Cook-Reckhow) proof system (with respect 
to the set of all unsatisfiable Boolean formulas; or for a subset of it, e.g. the set of unsatisfiable 
CNF formulas) stands for the fact that every unsatisfiable formula F has a string $\pi$ ("the refutation 
of F") so that $A(F, \pi) = 1$. The soundness of a (Cook-Reckhow) proof system stands for the fact 
that every formula F so that $A(F, \pi) = 1$ for some string $\pi$ is unsatisfiable (in other words, no 
satisfiable formula has a refutation). 

For instance, resolution is a Cook-Reckhow proof system, since it is complete and sound for the 
set of unsatisfiable CNF formulas, and given a CNF formula F and a string $\pi$ it is easy to check in 
polynomial-time (in both F and $\pi$) whether $\pi$ constitutes a resolution refutation of F. 

We shall also consider proof systems that are not necessarily (that is, not known to be) Cook- 
Reckhow proof systems. Specifically, multilinear proof systems (over large enough fields) meet the 
requirements in the definition of Cook-Reckhow proof systems, except that the condition on A 
above is relaxed: we allow A to be in probabilistic polynomial-time BPP (which is not known to 
be equal to deterministic polynomial-time). 

Polynomial simulations of proof systems. When comparing the strength of different proof 
systems we shall confine ourselves to CNF formulas only. That is, we consider propositional proof 
systems as proof systems for the set of unsatisfiable CNF formulas. For that purpose, if a proof 
system does not operate with clauses directly, then we fix a (direct) translation from clauses to 
the objects operated by the proof system. This is done for both resolution over linear equations 
(which operate with disjunctions of linear equations) and its fragments, and also for multilinear 
proofs (which operate with multilinear polynomials, represented as multilinear formulas); see for 
example Subsection 3.1 for such a direct translation. 

Definition 2.2. Let $\mathcal{P}_1, \mathcal{P}_2$ be two proof systems for the set of unsatisfiable CNF formulas (we 
identify a CNF formula with its corresponding translation, as discussed above). We say that $\mathcal{P}_2$ 
polynomially simulates $\mathcal{P}_1$ if given a $\mathcal{P}_1$ refutation $\pi$ of a CNF formula F, then there exists a 
refutation of F in $\mathcal{P}_2$ of size polynomial in the size of $\pi$. In case $\mathcal{P}_2$ polynomially simulates $\mathcal{P}_1$ while 
$\mathcal{P}_1$ does not polynomially simulate $\mathcal{P}_2$ we say that $\mathcal{P}_2$ is strictly stronger than $\mathcal{P}_1$. 

3. Resolution over Linear Equations and its Subsystems 

The proof systems we consider in this section are extensions of resolution. Proof-lines in res- 
olution are clauses. Instead of this, the extensions of resolution we consider here operate with 
disjunctions of linear equations with integral coefficients. For this section we use the convention 
that all the formal variables in the propositional proof systems considered are taken from the set 
$X := \{x_1, \ldots, x_n\}$. 

3.1. Disjunctions of Linear Equations. For L a linear equation $a_1 x_1 + \ldots + a_n x_n = a_0$, the 
right hand side $a_0$ is called the free-term of L and the left hand side $a_1 x_1 + \ldots + a_n x_n$ is called the 
linear form of L (the linear form can be 0). A disjunction of linear equations is of the following 
general form: 

$(a^{(1)}_1 x_1 + \ldots + a^{(1)}_n x_n = a^{(1)}_0) \vee \cdots \vee (a^{(t)}_1 x_1 + \ldots + a^{(t)}_n x_n = a^{(t)}_0)$ , (1) 

where $t \geq 0$ and the coefficients $a^{(j)}_i$ are integers (for all $0 \leq i \leq n$, $1 \leq j \leq t$). We discard duplicate 
linear equations from a disjunction of linear equations. The semantics of such a disjunction is the 
natural one: We say that an assignment of integral values to the variables $x_1, \ldots, x_n$ satisfies (1) 
if and only if there exists $j \in [t]$ so that the equation $a^{(j)}_1 x_1 + \ldots + a^{(j)}_n x_n = a^{(j)}_0$ holds under the 
given assignment. 

The symbol $\models$ denotes the semantic implication relation, that is, for every collection $D_1, \ldots, D_m$ 
of disjunctions of linear equations, 

$D_1, \ldots, D_m \models D_0$ 

means that every assignment of 0, 1 values that satisfies all $D_1, \ldots, D_m$ also satisfies $D_0$.² In this 
case we also say that $D_1, \ldots, D_m$ semantically imply $D_0$. 

The size of a linear equation $a_1 x_1 + \ldots + a_n x_n = a_0$ is $\sum_{i=0}^n |a_i|$, i.e., the sum of the bit sizes 
of all $a_i$ written in unary notation. Accordingly, the size of the linear form $a_1 x_1 + \ldots + a_n x_n$ is 
$\sum_{i=1}^n |a_i|$. The size of a disjunction of linear equations is the total size of all linear equations in it. 

Since all linear equations considered in this paper are of integral coefficients, we shall speak 
of linear equations when we actually mean linear equations with integral coefficients. Similar to 
resolution, the empty disjunction is unsatisfiable and stands for the truth value false. 

Translation of clauses. As described in the introduction, we can translate any CNF formula to 
a collection of disjunctions of linear equations in a direct manner: Every clause $\bigvee_{i \in I} x_i \vee \bigvee_{j \in J} \neg x_j$ 
(where I and J are sets of indices of variables) pertaining to the CNF is translated into the 
disjunction $\bigvee_{i \in I} (x_i = 1) \vee \bigvee_{j \in J} (x_j = 0)$. For a clause D we denote by $\widetilde{D}$ its translation into a 
disjunction of linear equations. It is easy to verify that any Boolean assignment to the variables 
$x_1, \ldots, x_n$ satisfies a clause D if and only if it satisfies $\widetilde{D}$ (where true is treated as 1 and false as 
0). 
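
The translation of clauses and the claim that it preserves satisfaction under Boolean assignments can be sketched as follows. The encoding is an assumption made for illustration and is not the paper's: a literal is a nonzero integer (i for $x_i$, -i for $\neg x_i$), a linear equation is a pair (coefficient tuple, free-term), and all function names are hypothetical.

```python
from itertools import product

def unit_vector(i, n):
    """Coefficient vector of the linear form x_i over n variables (1-indexed)."""
    return tuple(1 if k == i else 0 for k in range(1, n + 1))

def translate_clause(clause, n):
    """Map each positive literal x_i to (x_i = 1) and each negative one to (x_i = 0)."""
    return frozenset((unit_vector(abs(lit), n), 1 if lit > 0 else 0) for lit in clause)

def satisfies_clause(clause, x):
    """True if the 0/1 assignment x makes at least one literal true."""
    return any((x[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)

def satisfies_disjunction(disjunction, x):
    """True if at least one equation a.x = a0 holds under x."""
    return any(sum(c * v for c, v in zip(a, x)) == a0 for a, a0 in disjunction)

def translation_agrees(clause, n):
    """Check the 'if and only if' claim on all 0/1 assignments."""
    translated = translate_clause(clause, n)
    return all(satisfies_clause(clause, x) == satisfies_disjunction(translated, x)
               for x in product((0, 1), repeat=n))
```

For example, the clause $x_1 \vee \neg x_3$ over three variables translates to $(x_1 = 1) \vee (x_3 = 0)$, and `translation_agrees` confirms agreement on all eight assignments.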

3.2. Resolution over Linear Equations — R(lin). Defined below is our basic proof system 
R(lin) that enables resolution to reason with disjunctions of linear equations. As we wish to reason 
about Boolean variables we augment the system with the axioms $(x_i = 0) \vee (x_i = 1)$, for all $i \in [n]$, 
called the Boolean axioms. 

Definition 3.1 (R(lin)). Let $K := \{K_1, \ldots, K_m\}$ be a collection of disjunctions of linear equations.
An R(lin)-proof from $K$ of a disjunction of linear equations $D$ is a finite sequence $\pi = (D_1, \ldots, D_\ell)$
of disjunctions of linear equations, such that $D_\ell = D$ and for every $i \in [\ell]$, either $D_i = K_j$ for
some $j \in [m]$, or $D_i$ is a Boolean axiom $(x_h = 0) \vee (x_h = 1)$ for some $h \in [n]$, or $D_i$ was deduced
by one of the following R(lin)-inference rules, using $D_j, D_k$ for some $j, k < i$:

Resolution: Let $A, B$ be two disjunctions^3 of linear equations and let $L_1, L_2$ be two linear
equations.

From $A \vee L_1$ and $B \vee L_2$ derive $A \vee B \vee (L_1 + L_2)$.

Similarly, from $A \vee L_1$ and $B \vee L_2$ derive $A \vee B \vee (L_1 - L_2)$.

Weakening: From a disjunction of linear equations $A$ derive $A \vee L$, where $L$ is an arbitrary
linear equation over $X$.

Simplification: From $A \vee (0 = k)$ derive $A$, where $A$ is a disjunction of linear equations
and $k \neq 0$.

An R(lin) refutation of a collection of disjunctions of linear equations $K$ is a proof of the empty
disjunction from $K$. The size of an R(lin)-proof $\pi$ is the total size of all the disjunctions of linear
equations in $\pi$, denoted $|\pi|$.
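The three inference rules can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: an equation $a_1 x_1 + \ldots + a_n x_n = a_0$ is stored as a pair (coefficient tuple, free term), a disjunction is a frozenset of such pairs (so duplicate equations are discarded), and the names `combine`, `resolve`, `weaken` and `simplify` are hypothetical. Note that `simplify` applies the Simplification rule to every equation $(0 = k)$, $k \neq 0$, at once.

```python
from typing import FrozenSet, Tuple

# An equation a_1*x_1 + ... + a_n*x_n = a_0 is stored as ((a_1, ..., a_n), a_0).
Equation = Tuple[Tuple[int, ...], int]
Disjunction = FrozenSet[Equation]


def combine(l1: Equation, l2: Equation, sign: int) -> Equation:
    """Return l1 + l2 (sign=+1) or l1 - l2 (sign=-1), term by term."""
    coeffs = tuple(a + sign * b for a, b in zip(l1[0], l2[0]))
    return coeffs, l1[1] + sign * l2[1]


def resolve(d1: Disjunction, l1: Equation,
            d2: Disjunction, l2: Equation, sign: int = 1) -> Disjunction:
    """Resolution: from A v L1 and B v L2 derive A v B v (L1 +/- L2)."""
    assert l1 in d1 and l2 in d2
    return (d1 - {l1}) | (d2 - {l2}) | {combine(l1, l2, sign)}


def weaken(d: Disjunction, l: Equation) -> Disjunction:
    """Weakening: from A derive A v L."""
    return d | {l}


def simplify(d: Disjunction) -> Disjunction:
    """Simplification: drop every equation (0 = k) with k != 0."""
    return frozenset(eq for eq in d if any(eq[0]) or eq[1] == 0)


# One resolution step on the clauses (x1 v x2) and (not x1 v x2), over n = 2 variables.
X1_IS_1: Equation = ((1, 0), 1)
X1_IS_0: Equation = ((1, 0), 0)
X2_IS_1: Equation = ((0, 1), 1)
left = frozenset({X1_IS_1, X2_IS_1})
right = frozenset({X1_IS_0, X2_IS_1})
resolvent = resolve(left, X1_IS_1, right, X1_IS_0, sign=-1)  # contains (0 = 1)
result = simplify(resolvent)                                 # only (x2 = 1) remains
```

The final two lines mirror how a resolution step on clauses is carried out: subtract $(x_1 = 0)$ from $(x_1 = 1)$ to get $(0 = 1)$, then remove it by Simplification.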

Similar to resolution, in case $A \vee B \vee (L_1 + L_2)$ is derived from $A \vee L_1$ and $B \vee L_2$ by the
resolution rule, we say that $A \vee L_1$ and $B \vee L_2$ were resolved over $L_1$ and $L_2$, respectively, and we
call $A \vee B \vee (L_1 + L_2)$ the resolvent of $A \vee L_1$ and $B \vee L_2$ (and similarly when $A \vee B \vee (L_1 - L_2)$
is derived from $A \vee L_1$ and $B \vee L_2$ by the resolution rule; we use the same terminology for both
addition and subtraction, and it should be clear from the context which operation is actually
applied). We also describe such an application of the resolution rule by saying that $L_1$ was added
to (resp., subtracted from) $L_2$ in $A \vee L_1$ and $B \vee L_2$.

^2 Alternatively, we can consider assignments of any integral values (instead of only Boolean values) to the variables
in $D_1, \ldots, D_m$, stipulating that the collection $D_1, \ldots, D_m$ contains all disjunctions of the form $(x_j = 0) \vee (x_j = 1)$
for all the variables $x_j \in X$ (these formulas force any satisfying assignment to give only 0, 1 values to the variables).

^3 Possibly the empty disjunction. This remark also applies to the inference rules below.

In light of the direct translation between CNF formulas and collections of disjunctions of linear 
equations (described in the previous subsection), we can consider R(lin) to be a proof system for 
the set of unsatisfiable CNF formulas: 

Proposition 1. The R(lin) refutation system is a sound and complete Cook-Reckhow (see Section 2)
refutation system for unsatisfiable CNF formulas (translated into unsatisfiable collections of
disjunctions of linear equations).

Proof: Completeness of R(lin) (for the set of unsatisfiable CNF formulas) stems from a
straightforward simulation of resolution, as we now show.

Claim 1. R(lin) polynomially simulates resolution. 

Proof of claim: Proceed by induction on the length of the resolution refutation to show that any
resolution derivation of a clause $A$ can be translated with only a linear increase in size into an R(lin)
derivation of the corresponding disjunction of linear equations $\widetilde{A}$ (see the previous subsection for
the definition of $\widetilde{A}$).

The base case: An initial clause $A$ is translated into its corresponding disjunction of linear
equations $\widetilde{A}$.

The induction step: If a resolution clause $A \vee B$ was derived by the resolution rule from $A \vee x_i$
and $B \vee \neg x_i$, then in R(lin) we subtract $(x_i = 0)$ from $(x_i = 1)$ in $\widetilde{B} \vee (x_i = 0)$ and $\widetilde{A} \vee (x_i = 1)$,
respectively, to obtain $\widetilde{A} \vee \widetilde{B} \vee (0 = 1)$. Then, using the Simplification rule, we can cut off $(0 = 1)$
from $\widetilde{A} \vee \widetilde{B} \vee (0 = 1)$, and arrive at $\widetilde{A} \vee \widetilde{B}$.

If a clause $A \vee B$ was derived in resolution from $A$ by the Weakening rule, then we derive $\widetilde{A} \vee \widetilde{B}$
from $\widetilde{A}$ by the Weakening rule in R(lin). ■

Soundness of R(lin) stems from the soundness of the inference rules (which means that: If D 
was derived from C, B by the R(lin) resolution rule then any assignment that satisfies both C and 
B also satisfies D; and if D was derived from C by either the Weakening rule or the Simplification 
rule, then any assignment that satisfies C also satisfies D). 

The R(lin) proof system is a Cook-Reckhow proof system, as it is easy to verify in polynomial-time
whether an R(lin) proof-line is inferred, by an application of one of R(lin)'s inference rules,
from a previous proof-line (or proof-lines). Thus, any sequence of disjunctions of linear equations
can be checked in polynomial-time (in the size of the sequence) to decide whether or not it is a
legitimate R(lin) proof-sequence. ■

In Section 5 we shall see that a stronger notion of completeness (that is, implicational
completeness) holds for R(lin) and its subsystems.

3.3. Fragment of Resolution over Linear Equations R°(lin). Here we consider a restriction 
of R(lin), denoted R°(lin). As discussed in the introduction section, R°(lin) is roughly the fragment 
of R(lin) we know how to polynomially simulate with depth-3 multilinear proofs. 

By results established in the sequel (Sections 6.3 and 8), R(lin) is strictly stronger than R°(lin),
which means that R(lin) polynomially simulates R°(lin), while the converse does not hold.

R°(lin) operates with disjunctions of (arbitrarily many) linear equations with constant coefficients
(excluding the free terms), under the following restriction: Every disjunction can be partitioned
into a constant number of sub-disjunctions, where each sub-disjunction either consists of linear
equations that differ only in their free-terms or is a (translation of a) clause.

As mentioned in the introduction, every linear inequality with Boolean variables can be
represented by a disjunction of linear equations that differ only in their free-terms. So the R°(lin)
proof system resembles, to some extent, a proof system operating with disjunctions of a constant
number of linear inequalities with constant integral coefficients (on the other hand, it is probable
that R°(lin) is stronger than such a proof system, as a disjunction of linear equations that differ
only in their free terms is [expressively] stronger than a linear inequality [or even a disjunction of
linear inequalities]: the former can define the parity function while the latter cannot).
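Both representations can be checked by brute force on small instances. The sketch below is an illustration only; the helper names `satisfies` and `agrees_on_cube` are hypothetical, and a disjunction of equations that differ only in their free terms is represented by one coefficient tuple together with the set of its free terms.

```python
from itertools import product


def satisfies(coeffs, free_terms, x):
    """True iff x satisfies the disjunction over c in free_terms of (coeffs . x = c)."""
    return sum(a * v for a, v in zip(coeffs, x)) in free_terms


def agrees_on_cube(coeffs, free_terms, predicate):
    """Check that the disjunction and the predicate agree on every 0/1 assignment."""
    n = len(coeffs)
    return all(satisfies(coeffs, free_terms, x) == predicate(x)
               for x in product((0, 1), repeat=n))


n = 5
ones = (1,) * n
# The inequality x_1 + ... + x_n >= 2 as a disjunction over the free terms 2, ..., n.
at_least_two = agrees_on_cube(ones, set(range(2, n + 1)), lambda x: sum(x) >= 2)
# The parity of x_1, ..., x_n as a disjunction over the odd free terms 1, 3, 5.
parity = agrees_on_cube(ones, set(range(1, n + 1, 2)), lambda x: sum(x) % 2 == 1)
```

Both checks succeed for $n = 5$; the parity case uses a non-contiguous set of free terms, which is exactly what a single linear inequality cannot express.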

Example of an R°(lin)-line:

$$(x_1 + \ldots + x_\ell = 1) \vee \cdots \vee (x_1 + \ldots + x_\ell = \ell) \vee (x_{\ell+1} = 1) \vee \cdots \vee (x_n = 1),$$

for some $1 \le \ell \le n$. The next section contains other concrete (and natural) examples of
R°(lin)-lines.

Let us define formally what it means to be an R°(lin) proof-line, that is, a proof-line inside an
R°(lin) proof, called an R°(lin)-line:

Definition 3.2 (R°(lin)-line). Let $D$ be a disjunction of linear equations whose variables have
constant integer coefficients (the free-terms are unbounded). Assume $D$ can be partitioned into a
constant number $k$ of sub-disjunctions $D_1, \ldots, D_k$, where each $D_i$ either consists of (an unbounded)
disjunction of linear equations that differ only in their free-terms, or is a translation of a clause (as
defined in Subsection 3.1). Then the disjunction $D$ is called an R°(lin)-line.

Thus, any R°(lin)-line is of the following general form:

$$\bigvee_{i \in I_1} \left( \vec{a}^{(1)} \cdot \vec{x} = \ell_i^{(1)} \right) \vee \cdots \vee \bigvee_{i \in I_k} \left( \vec{a}^{(k)} \cdot \vec{x} = \ell_i^{(k)} \right) \vee \bigvee_{j \in J} (x_j = b_j), \tag{2}$$

where $k$ and all $a_r^{(t)}$ (for $r \in [n]$ and $t \in [k]$) are integer constants and $b_j \in \{0, 1\}$ (for all $j \in J$) (and
$I_1, \ldots, I_k, J$ are unbounded sets of indices). Note that a disjunction of clauses can be combined
into a single clause. Hence, without loss of generality we can assume that in any R°(lin)-line only
a single (translation of a) clause occurs. This is depicted in (2) (where in addition we have ignored
in (2) the possibility that the single clause obtained by combining several clauses contains $x_j \vee \neg x_j$,
for some $j \in [n]$).

Definition 3.3 (R°(lin)). The R°(lin) proof system is a restriction of the R(lin) proof system in 
which each proof-line is an R°(lin)-line (as in Definition 3.2). 
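To make the shape of an R°(lin)-line concrete, the following sketch splits a disjunction into one clause part and groups of equations that share a coefficient vector. It rests on assumptions not found in the paper: equations are (coefficient tuple, free term) pairs, the names `is_literal` and `partition_line` are hypothetical, and every equation of the form $(x_j = 0)$ or $(x_j = 1)$ is placed in the clause part. Since "constant" is an asymptotic notion, the code only reports the partition; comparing the number of parts and the coefficient sizes against a fixed bound is left to the caller.

```python
from collections import defaultdict


def is_literal(equation):
    """True for a translated literal, that is, (x_j = 0) or (x_j = 1)."""
    coeffs, free_term = equation
    nonzero = [a for a in coeffs if a != 0]
    return nonzero == [1] and free_term in (0, 1)


def partition_line(disjunction):
    """Split into a clause part and groups keyed by a shared coefficient vector."""
    clause, groups = [], defaultdict(list)
    for equation in disjunction:
        if is_literal(equation):
            clause.append(equation)
        else:
            groups[equation[0]].append(equation[1])
    return clause, dict(groups)


# (x1 + x2 = 1) v (x1 + x2 = 2) v (x3 = 1) v (x4 = 1), with n = 4.
line = [((1, 1, 0, 0), 1), ((1, 1, 0, 0), 2),
        ((0, 0, 1, 0), 1), ((0, 0, 0, 1), 1)]
clause, groups = partition_line(line)
number_of_parts = len(groups) + (1 if clause else 0)
```

Here the line splits into two parts: one group with coefficient vector (1, 1, 0, 0) and free terms 1 and 2, and one translated clause.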

For a completeness proof of R°(lin) see Section 5. 

4. Reasoning and Counting inside R(lin) and its Subsystems 

In this section we illustrate a simple way to reason by case-analysis inside R(lin) and its
subsystems. This kind of reasoning will simplify the presentation of proofs inside R(lin) (and R°(lin)) in
the sequel (essentially, a similar, though weaker, kind of reasoning is applicable already in
resolution). We will then demonstrate efficient and transparent proofs for simple counting arguments
that will also be useful in the sequel.



^4 The simulation of resolution inside R(lin) (in the proof of Proposition 1) is carried out with each R(lin) proof-line
being in fact a translation of a clause, and hence an R°(lin)-line (notice that the Boolean axioms of R(lin) are
R°(lin)-lines). This already implies that R°(lin) is a complete refutation system for the set of unsatisfiable CNF
formulas. In Section 5 we give a proof of a stronger notion of completeness for R°(lin).
4.1. Basic Reasoning inside R(lin) and its Subsystems. Given $K$ a collection of disjunctions
of linear equations $\{K_1, \ldots, K_m\}$ and $C$ a disjunction of linear equations, denote by $K \vee C$
the collection $\{K_1 \vee C, \ldots, K_m \vee C\}$. Recall that the formal variables in our proof system are
$x_1, \ldots, x_n$.

Lemma 4. Let $K$ be a collection of disjunctions of linear equations, and let $z$ abbreviate some linear
form with integer coefficients. Let $E_1, \ldots, E_\ell$ be $\ell$ disjunctions of linear equations. Assume that for
all $i \in [\ell]$ there is an R(lin) derivation of $E_i$ from $z = a_i$ and $K$ with size at most $s$, where $a_1, \ldots, a_\ell$
are distinct integers. Then, there is an R(lin) proof of $\bigvee_{i=1}^{\ell} E_i$ from $K$ and $(z = a_1) \vee \cdots \vee (z = a_\ell)$,
with size polynomial in $s$ and $\ell$.

Proof: Denote by $D$ the disjunction $(z = a_1) \vee \cdots \vee (z = a_\ell)$ and by $\pi_i$ the R(lin) proof of $E_i$ from
$K$ and $z = a_i$ (with size at most $s$), for all $i \in [\ell]$. It is easy to verify that for all $i \in [\ell]$ the sequence
$\pi_i \vee \bigvee_{j \in [\ell] \setminus \{i\}} (z = a_j)$ is an R(lin) proof of $E_i \vee \bigvee_{j \in [\ell] \setminus \{i\}} (z = a_j)$ from $K$ and $D$. So overall, given
$D$ and $K$ as premises, there is an R(lin) derivation of size polynomial in $s$ and $\ell$ of the following
collection of disjunctions of linear equations:

$$E_1 \vee \bigvee_{j \in [\ell] \setminus \{1\}} (z = a_j),\ \ldots,\ E_\ell \vee \bigvee_{j \in [\ell] \setminus \{\ell\}} (z = a_j). \tag{3}$$

We now use the Resolution rule to cut off all the equations $(z = a_i)$ inside all the disjunctions
in (3). Formally, we prove that for every $1 \le k \le \ell$ there is a polynomial-size (in $s$ and $\ell$) R(lin)
derivation from (3) of

$$E_1 \vee \cdots \vee E_k \vee \bigvee_{j \in [\ell] \setminus [k]} (z = a_j), \tag{4}$$

and so putting $k = \ell$ will conclude the proof of the lemma.

We proceed by induction on $k$. The base case for $k = 1$ is immediate (from (3)). For the
induction case, assume that for some $1 \le k < \ell$ we already have an R(lin) proof of (4), with size
polynomial in $s$ and $\ell$.

Consider the line

$$E_{k+1} \vee \bigvee_{j \in [\ell] \setminus \{k+1\}} (z = a_j). \tag{5}$$

We can now cut off the disjunctions $\bigvee_{j \in [\ell] \setminus [k]} (z = a_j)$ and $\bigvee_{j \in [\ell] \setminus \{k+1\}} (z = a_j)$ from (4) and (5),
respectively, using the Resolution rule (since the $a_j$'s are pairwise distinct). We will
demonstrate this derivation in some detail now, in order to exemplify a proof carried out inside R(lin).
We shall sometimes be less formal in the sequel.

Resolve (4) with (5) over $(z = a_{k+1})$ and $(z = a_1)$, respectively, to obtain

$$(0 = a_1 - a_{k+1}) \vee E_1 \vee \cdots \vee E_k \vee E_{k+1} \vee \bigvee_{j \in [\ell] \setminus \{1, k+1\}} (z = a_j). \tag{6}$$

Since $a_1 \neq a_{k+1}$, we can use the Simplification rule to cut off $(0 = a_1 - a_{k+1})$ from (6), and we
arrive at

$$E_1 \vee \cdots \vee E_k \vee E_{k+1} \vee \bigvee_{j \in [\ell] \setminus \{1, k+1\}} (z = a_j). \tag{7}$$

Now, similarly, resolve (4) with (7) over $(z = a_{k+1})$ and $(z = a_2)$, respectively, and use Simplification
to obtain

$$E_1 \vee \cdots \vee E_k \vee E_{k+1} \vee \bigvee_{j \in [\ell] \setminus \{1, 2, k+1\}} (z = a_j).$$

Continue in a similar manner until you arrive at

$$E_1 \vee \cdots \vee E_k \vee E_{k+1} \vee \bigvee_{j \in [\ell] \setminus \{1, 2, \ldots, k, k+1\}} (z = a_j),$$

which is precisely what we need. ■
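As a small worked instance of this case-analysis argument, consider $\ell = 2$. The derivation below is only an illustrative sketch; $\pi_1$ and $\pi_2$ stand for the assumed derivations of $E_1$ from $z = a_1$ and of $E_2$ from $z = a_2$ (together with $K$), and the premises are $K$ and $(z = a_1) \vee (z = a_2)$ with $a_1 \neq a_2$.

```latex
\begin{align*}
&E_1 \lor (z = a_2)
  && \text{obtained from } \pi_1 \lor (z = a_2) \\
&E_2 \lor (z = a_1)
  && \text{obtained from } \pi_2 \lor (z = a_1) \\
&(0 = a_1 - a_2) \lor E_1 \lor E_2
  && \text{Resolution: subtract } (z = a_2) \text{ from } (z = a_1) \\
&E_1 \lor E_2
  && \text{Simplification, since } a_1 - a_2 \neq 0
\end{align*}
```

For larger $\ell$ the same two steps (one Resolution, one Simplification) are repeated once for every equation $(z = a_j)$ that has to be removed.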

Under the appropriate conditions, Lemma 4 also holds for R°(lin) proofs. This is stated in the 
following lemma. 

Lemma 5. Let $K$ be a collection of disjunctions of linear equations, and let $z$ abbreviate a linear
form with integer coefficients. Let $E_1, \ldots, E_\ell$ be $\ell$ disjunctions of linear equations. Assume that for
all $i \in [\ell]$ there is an R°(lin) derivation of $E_i$ from $z = a_i$ and $K$ with size at most $s$, where the
$a_i$'s are distinct integers. Then, assuming $\bigvee_{i=1}^{\ell} E_i$ is an R°(lin)-line, there is an R°(lin) proof of
$\bigvee_{i=1}^{\ell} E_i$ from $K$ and $(z = a_1) \vee \cdots \vee (z = a_\ell)$, with size polynomial in $s$ and $\ell$.

Proof: It can be verified by simple inspection that, under the conditions spelled out in the
statement of the lemma, each proof-line in the R(lin) derivations in the proof of Lemma 4 is actually
an R°(lin)-line.^5 ■

Abbreviations. Lemmas 4 and 5 will sometimes allow us to proceed inside R(lin) and R°(lin)
in a slightly less formal manner. For example, the situation in Lemma 4 above can be depicted
by saying that "if $z = a_i$ implies $E_i$ (with a polynomial-size proof) for all $i \in [\ell]$, then $\bigvee_{i=1}^{\ell} (z = a_i)$
implies $\bigvee_{i=1}^{\ell} E_i$ (with a polynomial-size proof)".

In case $\bigvee_{i=1}^{\ell} (z = a_i)$ above is just the Boolean axiom $(x_i = 0) \vee (x_i = 1)$, for some $i \in [n]$, and
$x_i = 0$ implies $E_0$ and $x_i = 1$ implies $E_1$ (both with polynomial-size proofs), then to simplify the
writing we shall sometimes not mention the Boolean axiom at all. For example, the latter situation
can be depicted by saying that "if $x_i = 0$ implies $E_0$ with a polynomial-size proof and $x_i = 1$ implies
$E_1$ with a polynomial-size proof, then we can derive $E_0 \vee E_1$ with a polynomial-size proof".



4.2. Basic Counting inside R(lin) and R°(lin). In this subsection we illustrate how to
efficiently prove several basic counting arguments inside R(lin) and R°(lin). This will help us
show short proofs for hard tautologies in the sequel. In accordance with the last paragraph in
the previous subsection, we shall carry out the proofs inside R(lin) and R°(lin) with slightly less rigor.

Lemma 6. Let $z_1$ abbreviate $\vec{a} \cdot \vec{x}$ and $z_2$ abbreviate $\vec{b} \cdot \vec{x}$. Let $D_1$ be $\bigvee_{\alpha \in A} (z_1 = \alpha)$ and let $D_2$ be
$\bigvee_{\beta \in B} (z_2 = \beta)$, where $A, B$ are two (finite) sets of integers. Then there is a polynomial-size (in the
size of $D_1, D_2$) R(lin) proof from $D_1, D_2$ of:

$$\bigvee_{\alpha \in A,\, \beta \in B} (z_1 + z_2 = \alpha + \beta). \tag{8}$$

Moreover, if $\vec{a}$ and $\vec{b}$ consist of constant integers (which means that $D_1, D_2$ are R°(lin)-lines), then
there is a polynomial-size (in the size of $D_1, D_2$) R°(lin) proof of (8) from $D_1, D_2$.

Proof: Denote the elements of $A$ by $\alpha_1, \ldots, \alpha_k$. In case $z_1 = \alpha_i$, for some $i \in [k]$, we can add
$z_1 = \alpha_i$ to every equation in $\bigvee_{\beta \in B} (z_2 = \beta)$ to get $\bigvee_{\beta \in B} (z_1 + z_2 = \alpha_i + \beta)$. Therefore, there exist
$k$ R(lin) proofs, each with polynomial size (in $|D_1|$ and $|D_2|$), of

$$\bigvee_{\beta \in B} (z_1 + z_2 = \alpha_1 + \beta),\ \bigvee_{\beta \in B} (z_1 + z_2 = \alpha_2 + \beta),\ \ldots,\ \bigvee_{\beta \in B} (z_1 + z_2 = \alpha_k + \beta)$$

from $z_1 = \alpha_1, z_1 = \alpha_2, \ldots, z_1 = \alpha_k$, respectively.
Thus, by Lemma 4, we can derive

$$\bigvee_{\alpha \in A,\, \beta \in B} (z_1 + z_2 = \alpha + \beta) \tag{9}$$

from $D_1$ and $D_2$ in a polynomial-size (in $|D_1|$ and $|D_2|$) R(lin)-proof. This concludes the first part
of the lemma.

Assume that $\vec{a}$ and $\vec{b}$ consist of constant coefficients only. Then by inspecting the R(lin)-proof
of (9) from $D_1$ and $D_2$ demonstrated above (and by using Lemma 5 instead of Lemma 4), one can
verify that this proof is in fact carried out inside R°(lin). ■

^5 Note that when the proofs of $E_i$ from $z = a_i$, for all $i \in [\ell]$, are all done inside R°(lin), then the linear form $z$
ought to have constant coefficients.

An immediate corollary of Lemma 6 is the efficient formalization in R(lin) of the following obvious
counting argument: If a linear form equals some value in the interval (of integer numbers) $[a_0, a_1]$
and another linear form equals some value in $[b_0, b_1]$ (for some $a_0 \le a_1$ and $b_0 \le b_1$), then their
sum equals some value in $[a_0 + b_0, a_1 + b_1]$. More formally:

Corollary 7. Let $z_1$ abbreviate $\vec{a} \cdot \vec{x}$ and $z_2$ abbreviate $\vec{b} \cdot \vec{x}$. Let $D_1$ be $(z_1 = a_0) \vee (z_1 = a_0 + 1) \vee \ldots \vee
(z_1 = a_1)$, and let $D_2$ be $(z_2 = b_0) \vee (z_2 = b_0 + 1) \vee \ldots \vee (z_2 = b_1)$. Then there is a polynomial-size
(in the size of $D_1, D_2$) R(lin) proof from $D_1, D_2$ of

$$(z_1 + z_2 = a_0 + b_0) \vee (z_1 + z_2 = a_0 + b_0 + 1) \vee \ldots \vee (z_1 + z_2 = a_1 + b_1). \tag{10}$$

Moreover, if $\vec{a}$ and $\vec{b}$ consist of constant integers (which means that $D_1, D_2$ are R°(lin)-lines), then
there is a polynomial-size (in the size of $D_1, D_2$) R°(lin) proof of (10) from $D_1, D_2$.
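The free terms that occur in the derived lines (8) and (10) are simply the pairwise sums of the free terms of the two premises. A minimal sketch of that bookkeeping follows; the helper names `interval` and `sumset` are hypothetical, and the code checks only the semantic content of the counting argument, not the R(lin) derivation itself.

```python
def interval(lo, hi):
    """The integer interval [lo, hi] as a set."""
    return set(range(lo, hi + 1))


def sumset(values_1, values_2):
    """All sums alpha + beta with alpha in values_1 and beta in values_2."""
    return {alpha + beta for alpha in values_1 for beta in values_2}


# Intervals [a0, a1] = [1, 3] and [b0, b1] = [0, 2] combine to [a0 + b0, a1 + b1].
combined = sumset(interval(1, 3), interval(0, 2))

# For arbitrary finite sets the result need not be an interval.
sparse = sumset({0, 4}, {0, 1})
```

The number of distinct sums is at most the product of the two set sizes, which is why the derived disjunction stays polynomial in the size of the premises.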

Lemma 8. Let $\vec{a} \cdot \vec{x}$ be a linear form with $n$ variables, and let $A := \{\vec{a} \cdot \vec{x} \mid \vec{x} \in \{0, 1\}^n\}$ be the set
of all possible values of $\vec{a} \cdot \vec{x}$ over Boolean assignments to $\vec{x}$. Then there is a polynomial-size, in the
size of the linear form $\vec{a} \cdot \vec{x}$,^6 R(lin) proof of

$$\bigvee_{\alpha \in A} (\vec{a} \cdot \vec{x} = \alpha). \tag{11}$$

Moreover, if the coefficients in $\vec{a}$ are constants, then there is a polynomial-size (in the size of $\vec{a} \cdot \vec{x}$)
R°(lin) proof of (11).

Proof: Without loss of generality, assume that all the coefficients in $\vec{a}$ are nonzero. Consider the
Boolean axiom $(x_1 = 0) \vee (x_1 = 1)$ and the (first) coefficient $a_1$ from $\vec{a}$. Assume that $a_1 \ge 1$. Add
$(x_1 = 0)$ to itself $a_1 - 1$ times, and arrive at $(a_1 x_1 = 0) \vee (x_1 = 1)$. Then, in the resulting line, add
$(x_1 = 1)$ to itself $a_1 - 1$ times, until the following is reached:

$$(a_1 x_1 = 0) \vee (a_1 x_1 = a_1).$$

Similarly, in case $a_1 \le -1$ we can subtract ($|a_1| + 1$ many times) $(x_1 = 0)$ from itself in $(x_1 =
0) \vee (x_1 = 1)$, and then subtract ($|a_1| + 1$ many times) $(x_1 = 1)$ from itself in the resulting line.

In the same manner, we can derive the disjunctions: $(a_2 x_2 = 0) \vee (a_2 x_2 = a_2), \ldots, (a_n x_n =
0) \vee (a_n x_n = a_n)$.

Consider $(a_1 x_1 = 0) \vee (a_1 x_1 = a_1)$ and $(a_2 x_2 = 0) \vee (a_2 x_2 = a_2)$. From these two lines, by
Lemma 6, there is a derivation of size polynomial in $|a_1| + |a_2|$ of:

$$(a_1 x_1 + a_2 x_2 = 0) \vee (a_1 x_1 + a_2 x_2 = a_1) \vee (a_1 x_1 + a_2 x_2 = a_2) \vee (a_1 x_1 + a_2 x_2 = a_1 + a_2). \tag{12}$$

In a similar fashion, now consider $(a_3 x_3 = 0) \vee (a_3 x_3 = a_3)$ and apply Lemma 6 again, to obtain

$$\bigvee_{\alpha \in A'} (a_1 x_1 + a_2 x_2 + a_3 x_3 = \alpha), \tag{13}$$

where $A'$ is the set of all possible values of $a_1 x_1 + a_2 x_2 + a_3 x_3$ over Boolean assignments to $x_1, x_2, x_3$. The
derivation of (13) is of size polynomial in $|a_1| + |a_2| + |a_3|$.

Continue to consider, successively, all other lines $(a_4 x_4 = 0) \vee (a_4 x_4 = a_4), \ldots, (a_n x_n = 0) \vee
(a_n x_n = a_n)$, and apply the same reasoning. Each step uses a derivation of size at most polynomial
in $\sum_{i=1}^{n} |a_i|$. And so overall we reach the desired line (11), with a derivation of size polynomial in
the size of $\vec{a} \cdot \vec{x}$. This concludes the first part of the lemma.

Assume that $\vec{a}$ consists of constant coefficients only. Then by inspecting the R(lin)-proof
demonstrated above (and by using the second part of Lemma 6), one can see that this proof is in fact
carried out inside R°(lin). ■

^6 Recall that the size of $\vec{a} \cdot \vec{x}$ is $\sum_{i=1}^{n} |a_i|$, that is, the size of the unary representation of $\vec{a}$.
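The variable-by-variable construction in this proof can be mirrored by a short computation of the set $A$ of attainable values. The sketch below is an illustration only, with hypothetical function names; it computes the free terms that appear in line (11), not the derivation itself. All attainable values lie between the sum of the negative coefficients and the sum of the positive ones, so $A$ has at most $\sum_i |a_i| + 1$ elements, which is consistent with measuring size in unary.

```python
from itertools import product


def value_set(coeffs):
    """Values of a . x over Boolean x, built one variable at a time."""
    values = {0}
    for a in coeffs:
        # x_i = 0 keeps each value, x_i = 1 shifts it by a_i (one use of Lemma 6).
        values = values | {v + a for v in values}
    return values


def value_set_brute_force(coeffs):
    """The same set, obtained by enumerating all 2^n assignments."""
    return {sum(a * x for a, x in zip(coeffs, xs))
            for xs in product((0, 1), repeat=len(coeffs))}


coeffs = (3, -2, 5)
A = value_set(coeffs)
```

For the coefficients (3, -2, 5) the incremental computation passes through {0, 3} and {-2, 0, 1, 3} before reaching the final set of seven values.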

Lemma 9. There is a polynomial-size (in $n$) R°(lin) proof from

$$(x_1 = 1) \vee \cdots \vee (x_n = 1) \tag{14}$$

of

$$(x_1 + \ldots + x_n = 1) \vee \cdots \vee (x_1 + \ldots + x_n = n). \tag{15}$$

Proof: We show that for every $i \in [n]$, there is a polynomial-size (in $n$) R°(lin) proof from $(x_i = 1)$
of $(x_1 + \ldots + x_n = 1) \vee \cdots \vee (x_1 + \ldots + x_n = n)$. This concludes the proof since, by Lemma 5,
we can then derive from (14) (with a polynomial-size (in $n$) R°(lin) proof) the disjunction (14) in
which each $(x_i = 1)$ (for all $i \in [n]$) is replaced by $(x_1 + \ldots + x_n = 1) \vee \cdots \vee (x_1 + \ldots + x_n = n)$,
which is precisely the disjunction (15) (note that (15) is an R°(lin)-line).

Claim 2. For every $i \in [n]$, there is a polynomial-size (in $n$) R°(lin) proof from $(x_i = 1)$ of
$(x_1 + \ldots + x_n = 1) \vee \cdots \vee (x_1 + \ldots + x_n = n)$.

Proof of claim: By Lemma 8, for every $i \in [n]$ there is a polynomial-size (in $n$) R°(lin) proof
(using only the Boolean axioms) of

$$(x_1 + \ldots + x_{i-1} + x_{i+1} + \ldots + x_n = 0) \vee \cdots \vee (x_1 + \ldots + x_{i-1} + x_{i+1} + \ldots + x_n = n - 1). \tag{16}$$

Now add successively $(x_i = 1)$ to every equation in (16) (note that this can be done in R°(lin)).
We obtain precisely $(x_1 + \ldots + x_n = 1) \vee \cdots \vee (x_1 + \ldots + x_n = n)$. ■ ■



Lemma 10. There is a polynomial-size (in $n$) R°(lin) proof of $(x_1 + \ldots + x_n = 0) \vee (x_1 + \ldots + x_n = 1)$
from the collection of disjunctions consisting of $(x_i = 0) \vee (x_j = 0)$, for all $1 \le i < j \le n$.

Proof: We proceed by induction on $n$. The base case for $n = 1$ is immediate from the Boolean
axiom $(x_1 = 0) \vee (x_1 = 1)$. Assume we already have a polynomial-size proof of

$$(x_1 + \ldots + x_n = 0) \vee (x_1 + \ldots + x_n = 1). \tag{17}$$

If $x_{n+1} = 0$ we add $x_{n+1} = 0$ to both of the equations in (17), and reach:

$$(x_1 + \ldots + x_{n+1} = 0) \vee (x_1 + \ldots + x_{n+1} = 1). \tag{18}$$

Otherwise, $x_{n+1} = 1$, and so we can cut off $(x_{n+1} = 0)$ in all the initial disjunctions $(x_i = 0) \vee
(x_{n+1} = 0)$, for all $1 \le i \le n$. We thus obtain $(x_1 = 0), \ldots, (x_n = 0)$. Adding together $(x_1 =
0), \ldots, (x_n = 0)$ and $(x_{n+1} = 1)$ we arrive at

$$(x_1 + \ldots + x_{n+1} = 1). \tag{19}$$

So overall, either (18) holds or (19) holds; and so (using Lemma 5) we arrive at the disjunction of
(19) and (18), which is precisely (18). ■
5. Implicational Completeness of R(lin) and its Subsystems

In this section we provide a proof of the implicational completeness of R(lin) and its subsystems. 
We shall need this property in the sequel (see Section 6.2). The implicational completeness of a 
proof system is a stronger property than mere completeness. Essentially, a system is implicationally 
complete if whenever something is semantically implied by a set of initial premises, then it is 
also derivable from the initial premises. In contrast to this, mere completeness means that any 
tautology (or in case of a refutation system, any unsatisfiable set of initial premises) has a proof in 
the system (respectively, a refutation in the system). As a consequence, the proof of implicational 
completeness in this section establishes an alternative completeness proof to that obtained via 
simulating resolution (see Proposition 1). Note that we are not concerned in this section with the 
size of the proofs, but only with their existence. 

Recall the definition of the semantic implication relation $\models$ from Section 3.1. Formally, we say
that R(lin) is implicationally complete if for every collection of disjunctions of linear equations
$D_0, D_1, \ldots, D_m$, it holds that $D_1, \ldots, D_m \models D_0$ implies that there is an R(lin) proof of $D_0$ from
$D_1, \ldots, D_m$.

Theorem 11. R(lin) is implicationally complete. 

Proof: We proceed by induction on $n$, the number of variables $x_1, \ldots, x_n$ in $D_0, D_1, \ldots, D_m$.

The base case $n = 0$. We need to show that $D_1, \ldots, D_m \models D_0$ implies that there is an R(lin)
proof of $D_0$ from $D_1, \ldots, D_m$, where all $D_i$'s (for $0 \le i \le m$) have no variables but only constants.
This means that each $D_i$ is a disjunction of equations of the form $(0 = a_0)$ for some integer $a_0$ (if
a linear equation has no variables, then the left hand side of this equation must be 0; see Section
3.1).

There are two cases to consider. In the first case $D_0$ is satisfiable. Since $D_0$ has no variables,
this means precisely that $D_0$ is the equation $(0 = 0)$. Thus, $D_0$ can be derived easily from any
axiom in R(lin) (for instance, by subtracting each equation in $(x_1 = 0) \vee (x_1 = 1)$ from itself, to
reach $(0 = 0) \vee (0 = 0)$, which is equal to $(0 = 0)$, since we discard duplicate equations inside
disjunctions).

In the second case $D_0$ is unsatisfiable. Thus, since $D_1, \ldots, D_m \models D_0$, there is no assignment
satisfying all $D_1, \ldots, D_m$. Hence, there must be at least one unsatisfiable disjunction $D_i$ in $D_1, \ldots, D_m$
(as a disjunction with no variables is either tautological or unsatisfiable). Such an unsatisfiable $D_i$
is a disjunction of zero or more unsatisfiable equations of the form $(0 = a_0)$, for some integer $a_0 \neq 0$.
We can then use Simplification to cut off all the unsatisfiable equations in $D_i$ to reach the empty
disjunction. By the Weakening rule, we can now derive $D_0$ from the empty disjunction.

The induction step. Assume that the theorem holds for disjunctions with $n$ variables. Let the
underlying variables of $D_0, D_1, \ldots, D_m$ be $x_1, \ldots, x_{n+1}$, and assume that

$$D_1, \ldots, D_m \models D_0. \tag{20}$$

We write the disjunction $D_0$ as:

$$\bigvee_{j=1}^{t} \left( \sum_{i=1}^{n} a_i^{(j)} x_i + a_{n+1}^{(j)} x_{n+1} = a_0^{(j)} \right), \tag{21}$$

where the $a_i^{(j)}$'s are integer coefficients. We need to show that there is an R(lin) proof of $D_0$ from
$D_1, \ldots, D_m$.

Let $D$ be a disjunction of linear equations, let $x_i$ be a variable and let $b \in \{0, 1\}$. We shall
denote by $D\upharpoonright_{x_i = b}$ the disjunction $D$, where in every equation in $D$ the variable $x_i$ is substituted by
$b$, and the constant terms in the left hand sides of all resulting equations (after substituting $b$ for
$x_i$) switch sides (and change signs, obviously) to the right hand sides of the equations (we have to
switch sides of constant terms, as by definition linear equations in R(lin) proofs have all constant
terms appearing only on the right hand sides of equations).

We now reason (slightly) informally inside R(lin) (as illustrated in Section 4.1). Fix some $b \in
\{0, 1\}$, and assume that $x_{n+1} = b$. Then, from $D_1, \ldots, D_m$ we can derive (inside R(lin)):

$$D_1\upharpoonright_{x_{n+1} = b}, \ldots, D_m\upharpoonright_{x_{n+1} = b}. \tag{22}$$

The only variables occurring in (22) are $x_1, \ldots, x_n$. From assumption (20) we clearly have
$D_1\upharpoonright_{x_{n+1} = b}, \ldots, D_m\upharpoonright_{x_{n+1} = b} \models D_0\upharpoonright_{x_{n+1} = b}$. And so by the induction hypothesis there is an R(lin) derivation of
$D_0\upharpoonright_{x_{n+1} = b}$ from $D_1\upharpoonright_{x_{n+1} = b}, \ldots, D_m\upharpoonright_{x_{n+1} = b}$. So overall, assuming that $x_{n+1} = b$, there is an R(lin)
derivation of $D_0\upharpoonright_{x_{n+1} = b}$ from $D_1, \ldots, D_m$.

We now consider the two possible cases: $x_{n+1} = 0$ and $x_{n+1} = 1$.

In case $x_{n+1} = 0$, by the above discussion, we can derive $D_0\upharpoonright_{x_{n+1} = 0}$ from $D_1, \ldots, D_m$. For every
$j \in [t]$, add successively ($a_{n+1}^{(j)}$ times) the equation $x_{n+1} = 0$ to the $j$th equation in $D_0\upharpoonright_{x_{n+1} = 0}$ (see
(21)). We thus obtain precisely $D_0$.

In case $x_{n+1} = 1$, again, by the above discussion, we can derive $D_0\upharpoonright_{x_{n+1} = 1}$ from $D_1, \ldots, D_m$. For
every $j \in [t]$, add successively ($a_{n+1}^{(j)}$ times) the equation $x_{n+1} = 1$ to the $j$th equation in $D_0\upharpoonright_{x_{n+1} = 1}$
(recall that we switch sides of constant terms in every linear equation after the substitution of $x_{n+1}$
by 1 is performed in $D_0\upharpoonright_{x_{n+1} = 1}$). Again, we obtain precisely $D_0$. ■
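The substitution used in the induction step, and the way the original equation is recovered afterwards, can be sketched as follows. This is an illustration under assumptions that are not part of the paper: equations are (coefficient tuple, free term) pairs, variables are indexed from 0 in the code, and `restrict` and `add_copies` are hypothetical names. A negative `times` corresponds to subtracting the equation instead of adding it.

```python
def restrict(disjunction, i, b):
    """Substitute b for x_i and move the resulting constant to the right hand side."""
    restricted = set()
    for coeffs, free_term in disjunction:
        new_coeffs = coeffs[:i] + (0,) + coeffs[i + 1:]
        restricted.add((new_coeffs, free_term - coeffs[i] * b))
    return frozenset(restricted)


def add_copies(equation, i, b, times):
    """Add the equation (x_i = b) to the given equation `times` times."""
    coeffs, free_term = equation
    new_coeffs = coeffs[:i] + (coeffs[i] + times,) + coeffs[i + 1:]
    return new_coeffs, free_term + times * b


# D0 = (2*x1 + 3*x2 = 5) v (x1 - x2 = 0); restrict by x2 = 1 (index 1 in the code).
D0 = frozenset({((2, 3), 5), ((1, -1), 0)})
D0_restricted = restrict(D0, 1, 1)

# Adding (x2 = 1) back as many times as the original coefficient restores each equation.
first_back = add_copies(((2, 0), 2), 1, 1, 3)
second_back = add_copies(((1, 0), 1), 1, 1, -1)
```

In this example the restricted disjunction is $(2x_1 = 2) \vee (x_1 = 1)$, and adding $(x_2 = 1)$ three times to the first equation, or subtracting it once from the second, gives back the two equations of the original disjunction.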

By inspecting the proof of Theorem 11, it is possible to verify that if all the disjunctions
$D_0, D_1, \ldots, D_m$ are R°(lin)-lines (see Definition 3.2), then the proof of $D_0$ in R(lin) uses only
R°(lin)-lines as well. Therefore, we have:

Corollary 12. R°(lin) is implicationally complete. 

Remark 1. Corollary 12 states that any R°(lin)-line that is semantically implied by a set of initial 
R°(lin)-lines, is in fact derivable in R°(lin) from the initial R°(lin)-lines. On the other hand, it is 
possible that a certain proof of the same R°(lin)-line inside R(lin) will be significantly shorter than 
the proof inside R°(lin). Indeed, we shall see in Section 8 that for certain CNF formulas R(lin) has 
a super-polynomial speed-up over R°(lin). 

6. Short Proofs for Hard Tautologies 

In this section we show that R°(lin) is already enough to admit small proofs for "hard" counting 
principles like the pigeonhole principle and the Tseitin graph formulas for constant degree graphs. 
On the other hand, as we shall see in Section 8, R°(lin) inherits the same weakness that cutting 
planes proofs have with respect to the clique-coloring tautologies. Nevertheless, we can efficiently
prove the clique-coloring principle in (the stronger system) R(lin), not by using R(lin)'s "ability
to count", but rather by using its (straightforward) ability to simulate Res(2) proofs (that is, resolution
proofs extended to operate with 2-DNF formulas, instead of clauses).

6.1. The Pigeonhole Principle Tautologies in R°(lin). This subsection illustrates 
polynomial-size R°(lin) proofs of the pigeonhole principle. This will allow us to establish 
polynomial-size multilinear proofs operating with depth-3 multilinear formulas of the pigeonhole 
principle (in Section 9). 

The $m$ to $n$ pigeonhole principle states that $m$ pigeons cannot be mapped one-to-one into $n < m$
holes. The negation of the pigeonhole principle, denoted $\neg\mathrm{PHP}^m_n$, is formulated as an unsatisfiable
CNF formula as follows (where clauses are translated to disjunctions of linear equations):

Definition 6.1. The $\neg\mathrm{PHP}^m_n$ is the following set of clauses:

(1) Pigeons axioms: $(x_{i,1} = 1) \vee \cdots \vee (x_{i,n} = 1)$, for all $1 \le i \le m$;
(2) Holes axioms: $(x_{i,k} = 0) \vee (x_{j,k} = 0)$, for all $1 \le i < j \le m$ and for all $1 \le k \le n$.

The intended meaning of each propositional variable $x_{i,j}$ is that the $i$th pigeon is mapped to the
$j$th hole.

We now describe a polynomial-size (in $n$) refutation of $\neg\mathrm{PHP}^m_n$ inside R°(lin). For this purpose it
is sufficient to exhibit a polynomial-size refutation of the pigeonhole principle when the number of
pigeons $m$ equals $n + 1$ (because the set of clauses pertaining to $\neg\mathrm{PHP}^{n+1}_n$ is already contained in
the set of clauses pertaining to $\neg\mathrm{PHP}^m_n$, for any $m > n$). Thus, we fix $m = n + 1$. In this subsection
we shall say a proof in R°(lin) is of polynomial-size, always intending polynomial-size in $n$ (unless
otherwise stated).

By Lemma 9, for all $i \in [m]$ we can derive from the Pigeon axiom (for the $i$th pigeon):

$$(x_{i,1} + \ldots + x_{i,n} = 1) \vee \cdots \vee (x_{i,1} + \ldots + x_{i,n} = n) \tag{23}$$

with a polynomial-size R°(lin) proof.

By Lemma 10, from the Hole axioms we can derive, with a polynomial-size R°(lin) proof,

$$(x_{1,j} + \ldots + x_{m,j} = 0) \vee (x_{1,j} + \ldots + x_{m,j} = 1), \tag{24}$$

for all $j \in [n]$.

Let $S$ abbreviate the sum of all formal variables $x_{i,j}$. In other words,

$$S := \sum_{i \in [m],\, j \in [n]} x_{i,j}.$$

Lemma 13. There is a polynomial-size R°(lin) proof from (23) (for all $i \in [m]$) of

$$(S = m) \vee (S = m + 1) \vee \cdots \vee (S = m \cdot n).$$

Proof: For every $i \in [m]$ fix the abbreviation $z_i := x_{i,1} + \ldots + x_{i,n}$. Thus, by (23) we have
$(z_i = 1) \vee \cdots \vee (z_i = n)$.

Consider $(z_1 = 1) \vee \cdots \vee (z_1 = n)$ and $(z_2 = 1) \vee \cdots \vee (z_2 = n)$. By Corollary 7, we can derive
from these two lines

$$(z_1 + z_2 = 2) \vee (z_1 + z_2 = 3) \vee \cdots \vee (z_1 + z_2 = 2n) \tag{25}$$

with a polynomial-size R°(lin) proof.

Now, consider $(z_3 = 1) \vee \cdots \vee (z_3 = n)$ and (25). By Corollary 7 again, from these two lines we
can derive with a polynomial-size R°(lin) proof:

$$(z_1 + z_2 + z_3 = 3) \vee (z_1 + z_2 + z_3 = 4) \vee \cdots \vee (z_1 + z_2 + z_3 = 3n). \tag{26}$$

Continuing in the same way, we eventually arrive at

$$(z_1 + \ldots + z_m = m) \vee (z_1 + \ldots + z_m = m + 1) \vee \cdots \vee (z_1 + \ldots + z_m = m \cdot n),$$

which concludes the proof, since $S$ equals $z_1 + \ldots + z_m$. ■

Lemma 14. There is a polynomial-size R°(lin) proof from (24) of

$$(S = 0) \vee \cdots \vee (S = n).$$

Proof: For all $j \in [n]$, fix the abbreviation $y_j := x_{1,j} + \ldots + x_{m,j}$. Thus, by (24) we have
$(y_j = 0) \vee (y_j = 1)$, for all $j \in [n]$. Now the proof is similar to the proof of Lemma 8, except that
here single variables are abbreviations of linear forms.

If $y_1 = 0$ then we can add $y_1$ to the two sums in $(y_2 = 0) \vee (y_2 = 1)$, and reach $(y_1 + y_2 =
0) \vee (y_1 + y_2 = 1)$, and if $y_1 = 1$ we can do the same and reach $(y_1 + y_2 = 1) \vee (y_1 + y_2 = 2)$. So, by
Lemma 5, we can derive with a polynomial-size R°(lin) proof

$$(y_1 + y_2 = 0) \vee (y_1 + y_2 = 1) \vee (y_1 + y_2 = 2). \tag{27}$$

Now, we consider the three cases in (27): $y_1 + y_2 = 0$ or $y_1 + y_2 = 1$ or $y_1 + y_2 = 2$, and the clause
$(y_3 = 0) \vee (y_3 = 1)$. We arrive in a similar manner at $(y_1 + y_2 + y_3 = 0) \vee \cdots \vee (y_1 + y_2 + y_3 = 3)$.
We continue in the same way until we arrive at $(S = 0) \vee \cdots \vee (S = n)$. ■

Theorem 15. There is a polynomial-size R°(lin) refutation of the $m$ to $n$ pigeonhole principle.

Proof: By Lemmas 13 and 14 above, all we need is to show a polynomial-size refutation of $(S =
m) \vee \cdots \vee (S = m \cdot n)$ and $(S = 0) \vee \cdots \vee (S = n)$.

Since $n < m$, for all $0 \le k \le n$, if $S = k$ then using the Resolution and Simplification rules we
can cut off all the sums in $(S = m) \vee \cdots \vee (S = m \cdot n)$ and arrive at the empty clause. Thus, by
Lemma 5, there is a polynomial-size R°(lin) proof of the empty clause from $(S = 0) \vee \cdots \vee (S = n)$
and $(S = m) \vee \cdots \vee (S = m \cdot n)$. ■
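The counting argument behind this refutation can be checked by brute force on tiny instances. The sketch below is an illustration only, with hypothetical function names: it confirms semantically that the pigeon and hole axioms are unsatisfiable for 3 pigeons and 2 holes, and that the two ranges for $S$ obtained from Lemmas 13 and 14 do not intersect when $n < m$. It says nothing about proof size.

```python
from itertools import product


def php_satisfiable(m, n):
    """Brute-force search for 0/1 values x[i][k] satisfying all pigeon and hole axioms."""
    for bits in product((0, 1), repeat=m * n):
        x = [bits[i * n:(i + 1) * n] for i in range(m)]
        # Pigeon axioms: every pigeon i has some hole k with x[i][k] = 1.
        pigeons_ok = all(any(row) for row in x)
        # Hole axioms: (x[i][k] = 0) v (x[j][k] = 0) for all pigeons i < j.
        holes_ok = all(x[i][k] + x[j][k] <= 1
                       for k in range(n)
                       for i in range(m) for j in range(i + 1, m))
        if pigeons_ok and holes_ok:
            return True
    return False


def counting_ranges_disjoint(m, n):
    """Lemma 13 puts S in [m, m*n]; Lemma 14 puts S in [0, n]."""
    return set(range(m, m * n + 1)).isdisjoint(range(0, n + 1))
```

For example, `php_satisfiable(3, 2)` is false while `php_satisfiable(2, 2)` is true, matching `counting_ranges_disjoint(3, 2)` being true and `counting_ranges_disjoint(2, 2)` being false.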



6.2. Tseitin mod p Tautologies in R°(lin). This subsection establishes polynomial-size R°(lin) 
proofs of Tseitin graph tautologies (for constant degree graphs). This will allow us (in Section 9) 
to extend the multilinear proofs of the Tseitin mod p tautologies to any field of characteristic 0
(the proofs in [RT06] required working over a field containing a primitive pth root of unity when 
proving the Tseitin mod p tautologies; for more details see Section 9). 

Tseitin mod p tautologies (introduced in [BGIP01]) are generalizations of the (original, mod 2) 
Tseitin graph tautologies (introduced in [Tse68]). To build the intuition for the generalized version, 
we start by describing the (original) Tseitin mod 2 principle. Let G = (V, E) be a connected 
undirected graph with an odd number of vertices n. The Tseitin mod 2 tautology states that there 
is no sub-graph G' = (V, E'), where $E' \subseteq E$, so that for every vertex $v \in V$, the number of edges
from E' incident to v is odd. This statement is valid, since otherwise, summing the degrees of all 
the vertices in G' would amount to an odd number (since n is odd), whereas this sum also counts 
every edge in E' twice, and so is even. 
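The parity argument above can be checked exhaustively on small graphs. The sketch below is only an illustration; the function name and the two test graphs (a 5-cycle and a 4-cycle) are assumptions chosen for the example.

```python
from itertools import combinations

def has_all_odd_subgraph(n, edges):
    """Return True if some subset E' of edges gives every vertex odd degree."""
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            degree = [0] * n
            for a, b in subset:
                degree[a] += 1
                degree[b] += 1
            if all(d % 2 == 1 for d in degree):
                return True
    return False

# Hypothetical test graphs: a 5-cycle (odd n) and a 4-cycle (even n).
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
cycle4 = [(i, (i + 1) % 4) for i in range(4)]
```

With an odd number of vertices no such subset exists, while for the 4-cycle two disjoint edges already give every vertex degree 1.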

As mentioned above, the Tseitin mod 2 principle was generalized by Buss et al. [BGIP01] to 
obtain the Tseitin mod p principle. Let p > 2 be some fixed integer and let G = (V, E) be a 
connected undirected r-regular graph with n vertices and no double edges. Let G' = (V, E') be the 
corresponding directed graph that results from G by replacing every (undirected) edge in G with 
two opposite directed edges. Assume that $n \equiv 1 \pmod p$. Then, the Tseitin mod p principle states
that there is no way to assign to every edge in E' a value from $\{0, \ldots, p-1\}$, so that:

(i) : For every pair of opposite directed edges $e, \bar{e}$ in E', with assigned values a, b, respectively,
$a + b \equiv 0 \pmod p$; and

(ii) : For every vertex v in V, the sum of the values assigned to the edges in E' coming out of
v is congruent to 1 (mod p).

The Tseitin mod p principle is valid, since if we sum the values assigned to all edges of E' in
pairs we obtain 0 (mod p) (by (i)), whereas summing them by vertices we arrive at a total value of 1
(mod p) (by (ii) and since $n \equiv 1 \pmod p$). We shall see in what follows that this simple counting
argument can be carried out in a natural (and efficient) way already inside R°(lin).
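The counting argument can be confirmed by exhaustive search on a tiny instance. The sketch below is illustrative; the function name, the choice of K4 with p = 3, and the encoding in which condition (i) is enforced by construction are assumptions made for the example.

```python
from itertools import product

def tseitin_mod_p_solutions(n, undirected_edges, p):
    """Count assignments meeting conditions (i) and (ii).

    Condition (i) is built in: if the directed edge (a, b) gets value c,
    the opposite edge (b, a) gets (-c) mod p.
    """
    count = 0
    for values in product(range(p), repeat=len(undirected_edges)):
        out_sum = [0] * n
        for (a, b), c in zip(undirected_edges, values):
            out_sum[a] += c            # edge a -> b carries c
            out_sum[b] += (-c) % p     # edge b -> a carries -c mod p
        if all(s % p == 1 for s in out_sum):   # condition (ii)
            count += 1
    return count

# Hypothetical instance: K4 is 3-regular with n = 4, and 4 = 1 (mod 3).
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
# Contrast: a triangle has n = 3, which is not 1 (mod 3).
triangle = [(0, 1), (0, 2), (1, 2)]
```

On K4 with p = 3 the count is zero, as the principle predicts; on the triangle the hypothesis on n fails and satisfying assignments exist.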

As an unsatisfiable propositional formula (in CNF form) the negation of the Tseitin mod p
principle is formulated by assigning a variable $x_{e,i}$ for every edge $e \in E'$ and every residue i modulo
p. The variable $x_{e,i}$ is an indicator variable for the fact that the edge e has an associated value i.
The following are the clauses of the Tseitin mod p CNF formula (as translated to disjunctions of 
linear equations). 

Definition 6.2 (Tseitin mod p formulas ($\neg\mathrm{Tseitin}_{G,p}$)). Let p > 2 be some fixed integer and
let G = (V, E) be a connected undirected r-regular graph with n vertices and no double edges, and
assume that $n \equiv 1 \pmod p$. Let G' = (V, E') be the corresponding directed graph that results
from G by replacing every (undirected) edge in G with two opposite directed edges.

Given a vertex $v \in V$, denote the edges in E' coming out of v by e[v, 1], . . . , e[v, r] and define the
following set of (translation of) clauses:

$\mathrm{MOD}_{p,1}(v) := \left\{ \bigvee_{k=1}^{r} (x_{e[v,k],i_k} = 0) \;\middle|\; i_1, \ldots, i_r \in \{0, \ldots, p-1\} \text{ and } \sum_{k=1}^{r} i_k \not\equiv 1 \bmod p \right\}$.

The Tseitin mod p formula, denoted $\neg\mathrm{Tseitin}_{G,p}$, consists of the following (translation) of clauses:

1. $\bigvee_{i=0}^{p-1} (x_{e,i} = 1)$, for all $e \in E'$
(expresses that every edge is assigned at least one value from $0, \ldots, p-1$);

2. $(x_{e,i} = 0) \lor (x_{e,j} = 0)$, for all $i \ne j \in \{0, \ldots, p-1\}$ and all $e \in E'$
(expresses that every edge is assigned at most one value from $0, \ldots, p-1$);

3. $(x_{e,i} = 1) \lor (x_{\bar{e},p-i} = 0)$ and $(x_{e,i} = 0) \lor (x_{\bar{e},p-i} = 1)$ (footnote 7),
for all two opposite directed edges $e, \bar{e} \in E'$ and all $i \in \{0, \ldots, p-1\}$
(expresses condition (i) of the Tseitin mod p principle above);

4. $\mathrm{MOD}_{p,1}(v)$, for all $v \in V$
(expresses condition (ii) of the Tseitin mod p principle above).



Note that for every edge $e \in E'$, the disjunctions of (1,2) in Definition 6.2, combined with the
Boolean axioms of R°(lin), force any collection of edge-variables $x_{e,0}, \ldots, x_{e,p-1}$ to contain exactly
one $i \in \{0, \ldots, p-1\}$ so that $x_{e,i} = 1$. Also, it is easy to verify that, given a vertex $v \in V$,
any assignment $\sigma$ of 0, 1 values (to the relevant variables) satisfies both the disjunctions of (1,2)
and the disjunctions of $\mathrm{MOD}_{p,1}(v)$ if and only if $\sigma$ corresponds to an assignment of values from
$\{0, \ldots, p-1\}$ to the edges coming out of v that sums up to 1 (mod p).
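The "easy to verify" equivalence for a single vertex can be checked mechanically for small r and p. The sketch below is a brute-force illustration; the function name and the encoding `x[k][i]` (1 iff the k-th outgoing edge has value i) are assumptions made for the example.

```python
from itertools import product

def check_mod_clauses(r, p):
    """Brute-force the remark for one vertex with r outgoing edges.

    Assumed encoding: x[k][i] = 1 iff the k-th outgoing edge has value i.
    """
    # One forbidden tuple (i_1, ..., i_r) per clause of MOD_{p,1}(v).
    forbidden = [t for t in product(range(p), repeat=r) if sum(t) % p != 1]
    for bits in product((0, 1), repeat=r * p):
        x = [bits[k * p:(k + 1) * p] for k in range(r)]
        # clauses (1) and (2): exactly one value per edge
        one_value = all(sum(row) == 1 for row in x)
        # a clause OR_k (x[k][i_k] = 0) fails only if all x[k][i_k] = 1
        mod_ok = all(any(x[k][t[k]] == 0 for k in range(r)) for t in forbidden)
        satisfied = one_value and mod_ok
        # intended meaning: the chosen values sum to 1 mod p
        intended = one_value and sum(row.index(1) for row in x) % p == 1
        if satisfied != intended:
            return False
    return True
```

The search is exponential in r·p, so it is only practical for the constant-size neighbourhoods considered in this subsection.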

For the rest of this subsection we fix an integer p > 2 and a connected undirected r-regular
graph G = (V, E) with n vertices and no double edges, such that $n \equiv 1 \bmod p$ and r is a constant.
As in Definition 6.2, we let G' = (V, E') be the corresponding directed graph that results from G
by replacing every (undirected) edge in G with two opposite directed edges. We now proceed to
refute $\neg\mathrm{Tseitin}_{G,p}$ inside R°(lin) with a polynomial-size (in n) refutation.

Given a vertex $v \in V$, and the edges in E' coming out of v, denoted e[v, 1], . . . , e[v, r], define the
following abbreviation:

$\alpha_v := \sum_{j=1}^{r} \sum_{i=0}^{p-1} i \cdot x_{e[v,j],i}$. (28)

Lemma 16. Let $v \in V$ be any vertex in G'. Then there is a constant-size R°(lin) proof from
$\neg\mathrm{Tseitin}_{G,p}$ of the following disjunction:

$\bigvee_{\ell=0}^{r-1} (\alpha_v = 1 + \ell \cdot p)$. (29)

Proof: Let $T_v \subseteq \neg\mathrm{Tseitin}_{G,p}$ be the set of all disjunctions of the form (1,2,4) from Definition 6.2
that contain only variables pertaining to vertex v (that is, all the variables $x_{e,i}$, where $e \in E'$ is an
edge coming out of v, and $i \in \{0, \ldots, p-1\}$).



Footnote 7: If $i = 0$ then $x_{\bar{e},p-i}$ denotes $x_{\bar{e},0}$.
Claim 3. $T_v$ semantically implies (29), that is (footnote 8):

$T_v \models \bigvee_{\ell=0}^{r-1} (\alpha_v = 1 + \ell \cdot p)$.

Proof of claim: Let $\sigma$ be an assignment of 0, 1 values to the variables in $T_v$ that satisfies both the
disjunctions of (1,2) and the disjunctions of $\mathrm{MOD}_{p,1}(v)$ in Definition 6.2. As mentioned above (the
comment after Definition 6.2), such a $\sigma$ corresponds to an assignment of values from $\{0, \ldots, p-1\}$
to the edges coming out of v that sums up to 1 mod p. This means precisely that $\alpha_v \equiv 1 \bmod p$
under the assignment $\sigma$. Thus, there exists a nonnegative integer k, such that $\alpha_v = 1 + kp$ under
$\sigma$.

It remains to show that $k \le r - 1$ (and so the only possible values that $\alpha_v$ can get under $\sigma$
are $1, 1 + p, 1 + 2p, \ldots, 1 + (r-1)p$). Note that because $\sigma$ gives the value 1 to only one variable
from $x_{e[v,j],0}, \ldots, x_{e[v,j],p-1}$ (for every $j \in [r]$), the maximal value that $\alpha_v$ can have under $\sigma$ is
$r(p-1)$. Thus, $1 + kp \le rp - r$ and so $k \le r - 1$. ■
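The bound in Claim 3 is easy to confirm numerically. In the sketch below (illustrative; both function names are assumptions) an assignment satisfying (1,2) is represented directly by the tuple of values chosen for the r outgoing edges, so that the value of the abbreviation (28) is just the sum of the tuple.

```python
from itertools import product

def alpha_values(r, p):
    """Values of alpha_v over value tuples (i_1, ..., i_r) summing to 1 mod p."""
    return {sum(t) for t in product(range(p), repeat=r) if sum(t) % p == 1}

def allowed_values(r, p):
    """Right-hand sides appearing in disjunction (29)."""
    return {1 + l * p for l in range(r)}
```

For example, with r = 3 and p = 3 the attainable values are 1 and 4, both of which appear among the right-hand sides 1, 4, 7 of (29).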

From Claim 3 and from the implicational completeness of R°(lin) (Corollary 12), there exists an
R°(lin) derivation of (29) from $T_v$. It remains to show that this derivation is of constant size.

Since the degree r of G' and the modulus p are both constants, both $T_v$ and (29) have a constant
number of variables and constant coefficients (including the free-terms). Thus, there is a constant-size
R°(lin) derivation of (29) from $T_v$. ■

Lemma 17. There is a polynomial-size (in n) R°(lin) derivation from $\neg\mathrm{Tseitin}_{G,p}$ of the following
disjunction:

$\bigvee_{\ell=0}^{(r-1) \cdot n} \left( \sum_{v \in V} \alpha_v = n + \ell \cdot p \right)$.

Proof: Simply add successively all the equations pertaining to disjunctions (29), for all vertices
$v \in V$. Formally, we show that for every subset of vertices $\mathcal{V} \subseteq V$, with $|\mathcal{V}| = k$, there is a
polynomial-size (in n) R°(lin) derivation from $\neg\mathrm{Tseitin}_{G,p}$ of

$\bigvee_{\ell=0}^{(r-1) \cdot k} \left( \sum_{v \in \mathcal{V}} \alpha_v = k + \ell \cdot p \right)$, (30)

and so putting $\mathcal{V} = V$ will conclude the proof.

We proceed by induction on the size of $\mathcal{V}$. The base case, $|\mathcal{V}| = 1$, is immediate from Lemma 16.
Assume that we already derived (30) with a polynomial-size (in n) R°(lin) proof, for some $\mathcal{V} \subset V$,
such that $|\mathcal{V}| = k < n$. Let $u \in V \setminus \mathcal{V}$. By Lemma 16, we can derive

$\bigvee_{\ell=0}^{r-1} (\alpha_u = 1 + \ell \cdot p)$ (31)

from $\neg\mathrm{Tseitin}_{G,p}$ with a constant-size proof. Now, by Lemma 6, each linear equation in (31) can
be added to each linear equation in (30), with a polynomial-size (in n) R°(lin) proof. This results
in the following disjunction:

$\bigvee_{\ell=0}^{(r-1) \cdot (k+1)} \left( \sum_{v \in \mathcal{V} \cup \{u\}} \alpha_v = k + 1 + \ell \cdot p \right)$,

which is precisely what we need to conclude the induction step. ■

Footnote 8: Recall that we only consider assignments of 0, 1 values to variables when considering the semantic implication
relation $\models$.
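The induction step of Lemma 17 only manipulates free-terms, so its bookkeeping can be checked by tracking sets of right-hand sides. The sketch below is illustrative and ignores the left-hand sides entirely; the function names are assumptions.

```python
def rhs_set(k, r, p):
    """Free-terms of disjunction (30) for a vertex subset of size k."""
    return {k + l * p for l in range((r - 1) * k + 1)}

def add_vertex(current, r, p):
    """Add every equation of (31) to every equation of the current disjunction."""
    single = {1 + l * p for l in range(r)}
    return {a + b for a in current for b in single}
```

Adding one vertex to a set of size k yields exactly the free-terms of (30) for size k + 1, and the number of disjuncts grows only linearly in k, which is what keeps the derivation polynomial.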

Lemma 18. Let $e, \bar{e}$ be any pair of opposite directed edges in G' and let $i \in \{0, \ldots, p-1\}$. Let
$T_e \subseteq \neg\mathrm{Tseitin}_{G,p}$ be the set of all disjunctions of the form (1,2,3) from Definition 6.2 that contain
only variables pertaining to edges $e, \bar{e}$ (that is, all the variables $x_{e,j}, x_{\bar{e},j}$, for all $j \in \{0, \ldots, p-1\}$).
Then, there is a constant-size R°(lin) proof from $T_e$ of the following disjunction:

$(i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = 0) \lor (i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = p)$. (32)

Proof: First note that $T_e$ semantically implies

$(x_{e,i} + x_{\bar{e},p-i} = 0) \lor (x_{e,i} + x_{\bar{e},p-i} = 2)$. (33)

The number of variables in $T_e$ and (33) is constant. Hence, there is a constant-size R°(lin)-proof
of (33) from $T_e$. Also note that

$(x_{e,i} + x_{\bar{e},p-i} = 0) \lor (x_{e,i} + x_{\bar{e},p-i} = 2) \models (i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = 0) \lor (i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = p)$. (34)

Therefore, there is also an R°(lin)-proof of constant size from $T_e$ of the right-hand side of (34), that is, of (32). ■
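The semantic implication (34) involves only two 0/1 variables, so it can be verified by enumerating the four assignments. The sketch below is a minimal check; the function name and the use of `x`, `y` for the two variables of (33) are assumptions.

```python
def implication_34_holds(p):
    """Check (34) over all residues i and all 0/1 values of the two variables."""
    for i in range(p):
        for x in (0, 1):
            for y in (0, 1):
                premise = (x + y) in (0, 2)                  # disjunction (33)
                conclusion = (i * x + (p - i) * y) in (0, p)  # disjunction (32)
                if premise and not conclusion:
                    return False
    return True
```

The premise forces x = y, and then the linear form evaluates to 0 or to i + (p - i) = p.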

We are now ready to complete the polynomial-size R°(lin) refutation of $\neg\mathrm{Tseitin}_{G,p}$. Using the
two prior lemmas, the refutation idea is simple, as we now explain. Observe that

$\sum_{v \in V} \alpha_v = \sum_{\{e,\bar{e}\} \subseteq E',\ i \in \{0, \ldots, p-1\}} (i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i})$, (35)

where by $\{e, \bar{e}\} \subseteq E'$ we mean that $e, \bar{e}$ is a pair of opposite directed edges in G'.
Derive by Lemma 17 the disjunction

$\bigvee_{\ell=0}^{(r-1) \cdot n} \left( \sum_{v \in V} \alpha_v = n + \ell \cdot p \right)$. (36)

This disjunction expresses the fact that $\sum_{v \in V} \alpha_v \equiv 1 \bmod p$ (since $n \equiv 1 \bmod p$). On the other
hand, using Lemma 18, we can "sum together" all the equations (32) (for all $\{e, \bar{e}\} \subseteq E'$ and all
$i \in \{0, \ldots, p-1\}$), to obtain a disjunction expressing the statement that

$\sum_{\{e,\bar{e}\} \subseteq E',\ i \in \{0, \ldots, p-1\}} (i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i}) \equiv 0 \bmod p$.

By Equation (35), we then obtain the desired contradiction. This idea is formalized in the proof of
the following theorem:

Theorem 19. Let G = (V, E) be an r-regular graph with n vertices, where r is a constant. Fix
some modulus p. Then, there are polynomial-size (in n) R°(lin) refutations of $\neg\mathrm{Tseitin}_{G,p}$.

Proof: First, use Lemma 17 to derive

$\bigvee_{\ell=0}^{(r-1) \cdot n} \left( \sum_{v \in V} \alpha_v = n + \ell \cdot p \right)$. (37)

Second, use Lemma 18 to derive

$(i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = p) \lor (i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = 0)$, (38)

for every pair of opposite directed edges in G' = (V, E') (as in Definition 6.2) and every residue
$i \in \{0, \ldots, p-1\}$.
We now reason inside R°(lin). Pick a pair of opposite directed edges $e, \bar{e}$ and a residue
$i \in \{0, \ldots, p-1\}$. If $i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = 0$, then subtract this equation successively from every
equation in (37). We thus obtain a new disjunction, similar to that of (37), but which does not
contain the $x_{e,i}$ and $x_{\bar{e},p-i}$ variables, and with the same free-terms.

Otherwise, $i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = p$; then subtract this equation successively from every equation
in (37). Again, we obtain a new disjunction, similar to that of (37), but which does not contain
the $x_{e,i}$ and $x_{\bar{e},p-i}$ variables, and such that p is subtracted from every free-term in every equation.
Since, by assumption, $n \equiv 1 \bmod p$, the free-terms in every equation are (still) equal to 1 mod p.

So overall, in both cases ($i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = 0$ and $i \cdot x_{e,i} + (p-i) \cdot x_{\bar{e},p-i} = p$) we obtained
a new disjunction with all the free-terms in equations equal to 1 mod p.

We now continue the same process for every pair $e, \bar{e}$ of opposite directed edges in G' and
every residue i. Eventually, we discard all the variables $x_{e,i}$ in the equations, for every $e \in E'$
and $i \in \{0, \ldots, p-1\}$, while all the free-terms in every equation remain equal to 1 mod p.
Therefore, we arrive at a disjunction of equations of the form $(0 = \gamma)$ for some $\gamma \equiv 1 \bmod p$.
By using the Simplification rule we can cut off all such equations, and arrive finally at the empty
disjunction. ■
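The invariant driving the proof of Theorem 19 concerns free-terms only: every elimination step subtracts either 0 or p, so the residue 1 mod p is preserved and no free-term can ever become 0. The sketch below tracks just the set of free-terms and is an illustration, not a formal derivation; the function name and the concrete parameters in the usage note are assumptions.

```python
def final_free_terms(n, r, p, steps):
    """Free-terms reachable from (37) after `steps` eliminations via (38)."""
    free_terms = {n + l * p for l in range((r - 1) * n + 1)}
    for _ in range(steps):
        # case analysis on (38): subtract 0 in one case, p in the other
        free_terms = {c - s for c in free_terms for s in (0, p)}
    return free_terms
```

For instance, with n = 4, r = 3, p = 3 and 18 elimination steps, every resulting free-term is still congruent to 1 mod 3, so each final equation of the form (0 = γ) is false and can be removed by Simplification.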



6.3. The Clique-Coloring Principle in R(lin). In this section we observe that there are 
polynomial-size R(lin) proofs of the clique-coloring principle (for certain, weak, parameters). This 
implies, in particular, that R(lin) does not possess the feasible monotone interpolation property 
(see more details on the interpolation method in Section 7). 

Atserias, Bonet & Esteban [ABE02] demonstrated polynomial-size Res(2) refutations of the
clique-coloring formulas (for certain weak parameters; Theorem 20). Thus, it is sufficient to show
that R(lin) polynomially simulates Res(2) proofs (Proposition 2). This can be shown in a straightforward
manner. As noted in the first paragraph of Section 6, because the proofs of the clique-coloring
formula we discuss here only follow the proofs inside Res(2), these proofs in fact do
not take any advantage of the capacity "to count" inside R(lin) (this capacity is exemplified, for
instance, in Section 4.2).

We start with the clique-coloring formulas (these formulas will also be used in Section 8). These 
formulas express the clique-coloring principle that has been widely used in the proof complexity 
literature (cf., [BPR97], [Pud97], [Kra97], [Kra98], [ABE02], [Kra07]). This principle is based on 
the following basic combinatorial idea. Let G = (V, E) be an undirected graph with n vertices and
let $k' < k$ be two integers. Then, one of the following must hold:

(i) : The graph G does not contain a clique with k vertices;

(ii) : The graph G is not a complete $k'$-partite graph. In other words, there is no way to
partition G into $k'$ subgraphs $G_1, \ldots, G_{k'}$, such that every $G_i$ is an independent set, and
for all $i \ne j \in [k']$, all the vertices in $G_i$ are connected by edges (in E) to all the vertices in
$G_j$.

Obviously, if Item (ii) above is false (that is, if G is a complete $k'$-partite graph), then there
exists a $k'$-coloring of the vertices of G; hence the name clique-coloring for the principle.
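The combinatorial fact behind the principle, namely that a graph with a k-clique has no proper coloring with fewer than k colors, can be checked exhaustively over all graphs on a few vertices. The sketch below is illustrative; the function names and the choice n = 4 in the usage note are assumptions.

```python
from itertools import combinations, product

def has_clique(n, edges, k):
    """True if some k vertices are pairwise adjacent."""
    return any(all(frozenset(pair) in edges for pair in combinations(c, 2))
               for c in combinations(range(n), k))

def is_colorable(n, edges, colors):
    """True if the vertices can be properly colored with `colors` colors."""
    return any(all(col[a] != col[b] for a, b in map(tuple, edges))
               for col in product(range(colors), repeat=n))

def principle_holds(n, k, k_prime):
    """Check over all graphs on n vertices that a k-clique rules out a k'-coloring."""
    pairs = [frozenset(pr) for pr in combinations(range(n), 2)]
    for mask in product((0, 1), repeat=len(pairs)):
        edges = {pr for pr, bit in zip(pairs, mask) if bit}
        if has_clique(n, edges, k) and is_colorable(n, edges, k_prime):
            return False
    return True
```

With n = 4 the check succeeds for k = 3 and k' = 2, and fails once k' is allowed to equal k, since a triangle is 3-colorable.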

The propositional formulation of the (negation of the) clique-coloring principle is as follows.
Each variable $p_{i,j}$, for all $i \ne j \in [n]$, is an indicator variable for the fact that there is an edge in
G between vertex i and vertex j. Each variable $q_{\ell,i}$, for all $\ell \in [k]$ and all $i \in [n]$, is an indicator
variable for the fact that the vertex i in G is the $\ell$th vertex in the k-clique. Each variable $r_{\ell,i}$, for
all $\ell \in [k']$ and all $i \in [n]$, is an indicator variable for the fact that the vertex i in G pertains to the
independent set $G_\ell$.

Definition 6.3. The negation of the clique-coloring principle consists of the following unsatisfiable
collection of clauses (as translated to disjunctions of linear equations), denoted $\neg\mathrm{Clique}^n_{k,k'}$:

(i) $(q_{\ell,1} = 1) \lor \cdots \lor (q_{\ell,n} = 1)$, for all $\ell \in [k]$
(expresses that there exists at least one vertex in G which constitutes the $\ell$th vertex of
the k-clique);

(ii) $(q_{\ell,i} = 0) \lor (q_{\ell,j} = 0)$, for all $i \ne j \in [n]$, $\ell \in [k]$
(expresses that there exists at most one vertex in G which constitutes the $\ell$th vertex of
the k-clique);

(iii) $(q_{\ell,i} = 0) \lor (q_{\ell',i} = 0)$, for all $i \in [n]$, $\ell \ne \ell' \in [k]$
(expresses that the ith vertex of G cannot be both the $\ell$th and the $\ell'$th vertex of the
k-clique);

(iv) $(q_{\ell,i} = 0) \lor (q_{\ell',j} = 0) \lor (p_{i,j} = 1)$, for all $\ell \ne \ell' \in [k]$, $i \ne j \in [n]$
(expresses that if both the vertices i and j in G are in the k-clique, then there is an edge
in G between i and j);

(v) $(r_{1,i} = 1) \lor \cdots \lor (r_{k',i} = 1)$, for all $i \in [n]$
(expresses that every vertex of G pertains to at least one independent set);

(vi) $(r_{\ell,i} = 0) \lor (r_{\ell',i} = 0)$, for all $\ell \ne \ell' \in [k']$, $i \in [n]$
(expresses that every vertex of G pertains to at most one independent set);

(vii) $(p_{i,j} = 0) \lor (r_{t,i} = 0) \lor (r_{t,j} = 0)$, for all $i \ne j \in [n]$, $t \in [k']$
(expresses that if there is an edge between vertex i and j in G, then i and j cannot be
in the same independent set).

Remark 2. Our formulation of the clique-coloring formulas above is similar to the one used by
[BPR97], except that we consider also the $p_{i,j}$ variables (we added the (iv) clauses and changed
accordingly the (vii) clauses). This is done for the sake of clarity of the contradiction itself, and
also to make it clear that the formulas are in the appropriate form required by the interpolation
method (see Section 7 for details on the interpolation method). By resolving over the $p_{i,j}$ variables
in (iv) and (vii), one can obtain precisely the collection of clauses in [BPR97].

Atserias, Bonet & Esteban [ABE02] demonstrated polynomial-size (in n) Res(2) refutations of
$\neg\mathrm{Clique}^n_{k,k'}$, when $k = \sqrt{n}$ and $k' = (\log n)^2 / 8 \log\log n$. These are rather weak parameters, but
they suffice to establish the fact that Res(2) does not possess the feasible monotone interpolation
property.

The Res(2) proof system (also called 2-DNF resolution), first considered in [Kra01], is resolution
extended to operate with 2-DNF formulas, defined as follows. 

A 2-term is a conjunction of up to two literals. A 2-DNF is a disjunction of 2-terms. The size 
of a 2-term is the number of literals in it (that is, either 1 or 2). The size of a 2-DNF is the total 
size of all the 2-terms in it. 

Definition 6.4 (Res(2)). A Res(2) proof of a 2-DNF D from a collection K of 2-DNFs is a
sequence of 2-DNFs $D_1, D_2, \ldots, D_s$, such that $D_s = D$, and every $D_j$ is either from K or was
derived from previous line(s) in the sequence by the following inference rules:

Cut: Let A, B be two 2-DNFs.
From $A \lor \bigwedge_{i=1}^{2} l_i$ and $B \lor \bigvee_{i=1}^{2} \neg l_i$ derive $A \lor B$, where the $l_i$'s are (not necessarily distinct)
literals (and $\neg l_i$ is the negation of the literal $l_i$).

AND-introduction: Let A, B be two 2-DNFs and $l_1, l_2$ two literals.
From $A \lor l_1$ and $B \lor l_2$ derive $A \lor B \lor \bigwedge_{i=1}^{2} l_i$.

Weakening: From a 2-DNF A derive $A \lor \bigwedge_{i=1}^{2} l_i$, where the $l_i$'s are (not necessarily distinct)
literals.

A Res(2) refutation of a collection of 2-DNFs K is a Res(2) proof of the empty disjunction □ from
K (the empty disjunction stands for false). The size of a Res(2) proof is the total size of all the
2-DNFs in it.
Given a collection K of 2-DNFs we translate it into a collection of disjunctions of linear equations
via the following translation scheme. For a literal l, denote by $\widehat{l}$ the translation that maps a variable
$x_i$ into $x_i$, and $\neg x_i$ into $1 - x_i$. A 2-term $l_1 \land l_2$ is first transformed into the equation $\widehat{l}_1 + \widehat{l}_2 = 2$, and
then the free-terms in the left hand side of $\widehat{l}_1 + \widehat{l}_2 = 2$ (in case there are such free-terms) are moved
to the right hand side, so that the final translation of $l_1 \land l_2$ has only a single free-term in the
right hand side. A disjunction of 2-terms (that is, a 2-DNF) $D = \bigvee_{j \in J} (l_{j,1} \land l_{j,2})$ is translated into
the disjunction of the translations of the 2-terms, denoted by $\widehat{D}$. It is clear that every assignment
satisfies a 2-DNF D if and only if it satisfies $\widehat{D}$.
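The translation scheme can be sketched in a few lines. The representation below is an assumption made for illustration: a literal is a pair `(variable_name, is_positive)`, every 2-term is given as exactly two literals (a single literal l is written as l AND l), and an equation is a pair of a coefficient dictionary and a free-term.

```python
from itertools import product

def translate_term(term):
    """Translate a 2-term into (coefficients, free_term) of one linear equation.

    x maps to x and NOT x maps to 1 - x; the term l1 AND l2 becomes
    l1 + l2 = 2, with constants moved to the right-hand side.
    """
    coeffs, rhs = {}, 2
    for var, positive in term:
        if positive:
            coeffs[var] = coeffs.get(var, 0) + 1
        else:
            coeffs[var] = coeffs.get(var, 0) - 1
            rhs -= 1                      # move the constant 1 to the right
    return coeffs, rhs

def dnf_holds(dnf, assignment):
    """Evaluate the 2-DNF directly."""
    return any(all(assignment[v] == int(pos) for v, pos in term) for term in dnf)

def translation_holds(dnf, assignment):
    """Evaluate the disjunction of translated linear equations."""
    return any(sum(c * assignment[v] for v, c in coeffs.items()) == rhs
               for coeffs, rhs in map(translate_term, dnf))
```

For example, the 2-term x AND NOT y becomes x - y = 1, which holds under a 0/1 assignment exactly when x = 1 and y = 0.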

Proposition 2. R(lin) polynomially simulates Res(2). In other words, if $\pi$ is a Res(2) proof of D
from a collection of 2-DNFs $K_1, \ldots, K_t$, then there is an R(lin) proof of $\widehat{D}$ from $\widehat{K}_1, \ldots, \widehat{K}_t$ whose
size is polynomial in the size of $\pi$.

The proof of Proposition 2 proceeds by induction on the length (that is, the number of proof- 
lines) in the Res(2) proof. This is pretty straightforward and similar to the simulation of resolution 
by R(lin), as illustrated in the proof of Proposition 1. We omit the details. 

Theorem 20 ([ABE02]). Let $k = \sqrt{n}$ and $k' = (\log n)^2 / 8 \log\log n$. Then $\neg\mathrm{Clique}^n_{k,k'}$ has Res(2)
refutations of size polynomial in n.

Thus, Proposition 2 yields the following: 

Corollary 21. Let $k, k'$ be as in Theorem 20. Then $\neg\mathrm{Clique}^n_{k,k'}$ has R(lin) refutations of size
polynomial in n.

The following corollary is important (we refer the reader to Section A in the Appendix for the
necessary relevant definitions concerning the feasible monotone interpolation property and to Section
7 for explanation and definitions concerning the general [non-monotone] interpolation method).

Corollary 22. R(lin) does not possess the feasible monotone interpolation property. 

Remark 3. The proof of $\neg\mathrm{Clique}^n_{k,k'}$ inside Res(2) demonstrated in [ABE02] (and hence, also
the corresponding proof inside R(lin)) proceeds along the following lines. First reduce $\neg\mathrm{Clique}^n_{k,k'}$
to the k to $k'$ pigeonhole principle. For the appropriate values of the parameters k and $k'$,
and specifically for the values in Theorem 20, there is a short resolution proof of the k to $k'$
pigeonhole principle (this was shown by Buss & Pitassi [BP97]; this resolution proof is polynomial
in the number of pigeons k, but not in the number of holes $k'$, which is exponentially smaller than
k) (footnote 9). Therefore, in order to conclude the refutation of $\neg\mathrm{Clique}^n_{k,k'}$ inside Res(2) (or inside R(lin)), it
suffices to simulate the short resolution refutation of the k to $k'$ pigeonhole principle. It is important
to emphasize this point: after reducing, inside R(lin), $\neg\mathrm{Clique}^n_{k,k'}$ to the pigeonhole principle, one
simulates the resolution refutation of the pigeonhole principle, and this has nothing to do with
the small-size R°(lin) refutations of the pigeonhole principle demonstrated in Section 6.1. This is
because the reduction (inside R(lin)) of $\neg\mathrm{Clique}^n_{k,k'}$ to the k to $k'$ pigeonhole principle results in
a substitution instance of the pigeonhole principle formulas; in other words, the reduction results
in a collection of disjunctions that are similar to the pigeonhole principle disjunctions where each
original pigeonhole principle variable is substituted by some big formula (and, in particular, these
disjunctions are not R°(lin)-lines at all). (Note that R°(lin) does not admit short proofs of the
clique-coloring formulas as we show in Section 8.)



Footnote 9: Whenever $k > 2k'$ the k to $k'$ pigeonhole principle is referred to as the weak pigeonhole principle.
7. Interpolation Results for R°(lin)

In this section we study the applicability of the feasible (non-monotone) interpolation technique
to R°(lin) refutations. In particular, we show that R°(lin) admits a polynomial (in terms of the
R°(lin)-proofs) upper bound on the (non-monotone) circuit-size of interpolants. In the next section
we shall give a polynomial upper bound on the monotone circuit-size of interpolants, but only in the
case that the interpolant corresponds to the clique-coloring formulas (whereas, in this section we are
interested in the general case; that is, upper bounding circuit-size of interpolants corresponding to
any formula [of the prescribed type; see below]). First, we briefly describe the feasible interpolation
method and explain how this method can be applied to obtain (sometimes conditional) lower bounds
on proof size. Explicit usage of the interpolation method in proof complexity goes back to [Kra94].

Let $A_i(p, q)$, $i \in I$, and $B_j(p, r)$, $j \in J$ (I and J are sets of indices), be a collection of formulas (for
instance, a collection of disjunctions of linear equations) in the displayed variables only. Denote by
$A(p, q)$ the conjunction of all $A_i(p, q)$, $i \in I$, and by $B(p, r)$ the conjunction of all $B_j(p, r)$, $j \in J$.
Assume that p, q, r are pairwise disjoint sets of distinct variables, and that there is no assignment
that satisfies both $A(p, q)$ and $B(p, r)$. Fix an assignment a to the variables in p. The p variables
are the only common variables of the $A_i$'s and the $B_j$'s. Therefore, either $A(a, q)$ is unsatisfiable
or $B(a, r)$ is unsatisfiable.

The interpolation technique transforms a refutation of $A(p, q) \land B(p, r)$, in some proof system,
into a circuit (usually a Boolean circuit) separating those assignments a (for p) for which $A(a, q)$
is unsatisfiable from those assignments a for which $B(a, r)$ is unsatisfiable (the two cases are not
necessarily exclusive, so if both cases hold for an assignment, the circuit can output either that the
first case holds or that the second case holds). In other words, given a refutation of $A(p, q) \land B(p, r)$,
we construct a circuit $C(p)$, called the interpolant, such that

$C(a) = 1 \implies A(a, q)$ is unsatisfiable, and
$C(a) = 0 \implies B(a, r)$ is unsatisfiable. (39)

(Note that if U denotes the set of those assignments a for which $A(a, q)$ is satisfiable, and V
denotes the set of those assignments a for which $B(a, r)$ is satisfiable, then U and V are disjoint
[since $A(p, q) \land B(p, r)$ is unsatisfiable], and $C(p)$ separates U from V; see Definition 7.2 below.)

Assume that for a proof system $\mathcal{P}$ the transformation from refutations of $A(p, q), B(p, r)$ into
the corresponding interpolant circuit $C(p)$ results in a circuit whose size is polynomial in the size
of the refutation. Then, an exponential lower bound on circuits for which (39) holds implies an
exponential lower bound on $\mathcal{P}$-refutations of $A(p, q), B(p, r)$.
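To see why a function satisfying (39) always exists, one can define it by exhaustive search: output 1 exactly when $A(a, q)$ is unsatisfiable. The sketch below is only an existence illustration and says nothing about circuit size, which is the whole point of feasible interpolation; the function names and the toy formulas `A` and `B` are assumptions.

```python
from itertools import product

def satisfiable(formula, a, extra_len):
    """Is formula(a, x) true for some 0/1 tuple x of length extra_len?"""
    return any(formula(a, x) for x in product((0, 1), repeat=extra_len))

def brute_force_interpolant(A, s):
    """Return C with C(a) = 1 exactly when A(a, q) is unsatisfiable."""
    def C(a):
        return 0 if satisfiable(A, a, s) else 1
    return C

# Hypothetical toy instance with n = 2, s = 1, t = 1; A and B share only a.
def A(a, q):
    return a[0] == 1 and q[0] == a[1]      # satisfiable iff a[0] = 1

def B(a, r):
    return a[0] == 0 and r[0] != a[1]      # satisfiable iff a[0] = 0

C = brute_force_interpolant(A, 1)
```

If C(a) = 0 then $A(a, q)$ is satisfiable, and since A and B are jointly unsatisfiable, $B(a, r)$ must be unsatisfiable, which is the second line of (39).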

7.1. Interpolation for Semantic Refutations. We now lay out the basic concepts needed to 
formally describe the feasible interpolation technique. We use the general notion of semantic 
refutations (which generalizes any standard propositional refutation system). We shall use
terminology close to that in [Kra97].

Definition 7.1 (Semantic refutation). Let N be a fixed natural number and let $E_1, \ldots, E_k \subseteq \{0, 1\}^N$,
where $\bigcap_{i=1}^{k} E_i = \emptyset$. A semantic refutation from $E_1, \ldots, E_k$ is a sequence $D_1, \ldots, D_m \subseteq \{0, 1\}^N$
with $D_m = \emptyset$ and such that for every $i \in [m]$, $D_i$ is either one of the $E_j$'s or is deduced
from two previous $D_j, D_\ell$, $1 \le j, \ell < i$, by the following semantic inference rule:

• From $A, B \subseteq \{0, 1\}^N$ deduce any C, such that $C \supseteq (A \cap B)$.

Observe that any standard propositional refutation (with inference rules that derive from at
most two proof-lines, a third line) can be regarded as a semantic refutation: just substitute each
refutation-line by the set of its satisfying assignments; and by the soundness of the inference rules
applied in the refutation, it is clear that each refutation-line (considered as the set of assignments
that satisfy it) is deduced by the semantic inference rule from previous refutation-lines.
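Definition 7.1 translates directly into a small checker when sets of assignments are stored explicitly. The sketch below is illustrative and only feasible for tiny N; the function name and the representation of assignments as tuples inside Python sets are assumptions.

```python
def is_semantic_refutation(initial_sets, lines):
    """Check Definition 7.1 for sets of assignments given as Python sets.

    `lines` is the sequence D_1, ..., D_m; each line must be an initial
    set or a superset of the intersection of two earlier lines.
    """
    if not lines or lines[-1] != set():
        return False
    for i, d in enumerate(lines):
        if d in initial_sets:
            continue
        earlier = lines[:i]
        if not any(d >= (a & b) for a in earlier for b in earlier):
            return False
    return True
```

For N = 1, the sets {(0,)} and {(1,)} have empty intersection, so listing both and then the empty set is a valid semantic refutation, while jumping from one of them straight to the empty set is not.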
Definition 7.2 (Separating circuit). Let $\mathcal{U}, \mathcal{V} \subseteq \{0, 1\}^n$, where $\mathcal{U} \cap \mathcal{V} = \emptyset$, be two disjoint sets. A
Boolean circuit C with n input variables is said to separate $\mathcal{U}$ from $\mathcal{V}$ if $C(x) = 1$ for every $x \in \mathcal{U}$,
and $C(x) = 0$ for every $x \in \mathcal{V}$. In this case we also say that $\mathcal{U}$ and $\mathcal{V}$ are separated by C.

Convention: In what follows we sometimes identify a Boolean formula with the set of its satisfying
assignments.

Notation: For two (or more) binary strings $u, v \in \{0, 1\}^*$, we write (u, v) to denote the concatenation
of u with v (where v comes to the right of u).

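Let N = n + s + t be fixed from now on. Let $A_1, \ldots, A_k \subseteq \{0, 1\}^{n+s}$ and let $B_1, \ldots, B_\ell \subseteq \{0, 1\}^{n+t}$.
Define the following two sets of assignments of length n (formally, 0, 1 strings of length n) that
can be extended to satisfying assignments of $A_1, \ldots, A_k$ and $B_1, \ldots, B_\ell$, respectively (formally,
to those 0, 1 strings of length n + s and n + t that are contained in all of $A_1, \ldots, A_k$ and $B_1, \ldots, B_\ell$,
respectively):

$U_A := \{ u \in \{0, 1\}^n \mid \exists q \in \{0, 1\}^s,\ (u, q) \in \bigcap_{i \in [k]} A_i \}$,
$V_B := \{ v \in \{0, 1\}^n \mid \exists r \in \{0, 1\}^t,\ (v, r) \in \bigcap_{i \in [\ell]} B_i \}$.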

Definition 7.3 (polynomial upper bounds on interpolants). Let $\mathcal{P}$ be a propositional
refutation system. Assume that p, q, r are pairwise disjoint sets of distinct variables, where
p has n variables, q has s variables and r has t variables. Let $A_1(p, q), \ldots, A_k(p, q)$ and
$B_1(p, r), \ldots, B_\ell(p, r)$ be two collections of formulas with the displayed variables only. Assume
that for any such $A_1(p, q), \ldots, A_k(p, q)$ and $B_1(p, r), \ldots, B_\ell(p, r)$, if there exists a $\mathcal{P}$-refutation of
size S for $A_1(p, q) \land \cdots \land A_k(p, q) \land B_1(p, r) \land \cdots \land B_\ell(p, r)$ then there exists a Boolean circuit
separating $U_A$ from $V_B$ of size polynomial in S (footnote 10). In this case we say that $\mathcal{P}$ has a polynomial upper
bound on interpolant circuits.

7.1.1. The Communication Game Technique. The feasible interpolation via communication game
technique is based on transforming proofs into Boolean circuits, where the size of the resulting
circuit depends on the communication complexity of each proof-line. This technique goes back to
[IPU94] and [Razb95] and was subsequently applied and extended in [BPR97] and [Kra97] ([IPU94]
and [BPR97] did not use explicitly the notion of interpolation of tautologies or contradictions). We
shall employ the interpolation theorem of Krajíček in [Kra97], which demonstrates how to transform
a small semantic refutation with each proof-line having low communication complexity into a small
Boolean circuit separating the corresponding sets.

The underlying idea of the interpolation via communication game technique is that a (semantic)
refutation, where each proof-line is of small (that is, logarithmic) communication complexity, can be
transformed into an efficient communication protocol for the Karchmer-Wigderson game (following
[KW88]) for two players. In the Karchmer-Wigderson game the first player knows some binary
string $u \in U$ and the second player knows some different binary string $v \in V$, where U and V are
disjoint sets of strings. The two players communicate by sending information bits to one another
(following a protocol previously agreed on). The goal of the game is for the two players to decide
on an index i such that the ith bit of u is different from the ith bit of v. An efficient Karchmer-Wigderson
protocol (by which we mean a protocol that requires the players to exchange at most a
logarithmic number of bits in the worst case) can then be transformed into a small circuit separating
U from V (see Definition 7.2). This efficient transformation from protocols for Karchmer-Wigderson
games (described in a certain way) into circuits was demonstrated by Razborov in [Razb95]. So
overall, given a semantic refutation with proof-lines of low communication complexity, one can
obtain a small circuit for separating the corresponding sets.

Footnote 10: Here $U_A$ and $V_B$ are defined as above, by identifying the $A_i(p, q)$'s and the $B_i(p, r)$'s with the sets of assignments
that satisfy them.
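As a concrete picture of an efficient Karchmer-Wigderson protocol, consider the textbook case (an assumption chosen for illustration, not an example taken from this paper) where U is the set of strings of odd parity and V the set of strings of even parity. A binary search that exchanges two bits per round finds a differing index with logarithmically many bits; the function name is hypothetical.

```python
def kw_parity_protocol(u, v):
    """Find i with u[i] != v[i], given that u and v have different parity.

    Returns (index, bits_exchanged). Each round Player I sends the parity
    of the left half of the current interval of u and Player II replies
    with the corresponding parity for v.
    """
    lo, hi, bits = 0, len(u), 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        pu = sum(u[lo:mid]) % 2      # bit sent by Player I
        pv = sum(v[lo:mid]) % 2      # bit sent by Player II
        bits += 2
        if pu != pv:
            hi = mid                 # the parities already differ on the left
        else:
            lo = mid                 # so they must differ on the right
    return lo, bits
```

The invariant is that the parities of u and v differ on the current interval, so the single remaining position is a valid answer; the number of rounds is about the base-2 logarithm of the string length.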

First, we need to define the concept of communication complexity in a suitable way for the 
interpolation theorem. 

Definition 7.4 (Communication complexity). Let N = n + s + t and $A \subseteq \{0, 1\}^N$. Let $u, v \in \{0, 1\}^n$,
$q^u \in \{0, 1\}^s$, $r^v \in \{0, 1\}^t$. Denote by $u_i, v_i$ the ith bit of u, v, respectively, and let $(u, q^u, r^v)$
and $(v, q^u, r^v)$ denote the concatenation of strings $u, q^u, r^v$ and $v, q^u, r^v$, respectively. Consider the
following three tasks:

(1) Decide whether $(u, q^u, r^v) \in A$;

(2) Decide whether $(v, q^u, r^v) \in A$;

(3) If one of the following holds:

(i) $(u, q^u, r^v) \in A$ and $(v, q^u, r^v) \notin A$; or

(ii) $(u, q^u, r^v) \notin A$ and $(v, q^u, r^v) \in A$,
then find an $i \in [n]$, such that $u_i \ne v_i$.

Consider a game between two players, Player I and Player II, where Player I knows $u \in \{0, 1\}^n$, $q^u \in \{0, 1\}^s$
and Player II knows $v \in \{0, 1\}^n$, $r^v \in \{0, 1\}^t$. The two players communicate by exchanging
bits of information between them (following a protocol previously agreed on). The communication
complexity of A, denoted CC(A), is the minimal (over all protocols) number of bits that players I
and II need to exchange in the worst case in solving each of Tasks 1, 2 and 3 above (footnote 11).

For $A \subseteq \{0, 1\}^{n+s}$ define

$\dot{A} := \{(a, b, c) \mid (a, b) \in A \text{ and } c \in \{0, 1\}^t\}$,

where a and b range over $\{0, 1\}^n$ and $\{0, 1\}^s$, respectively. Similarly, for $B \subseteq \{0, 1\}^{n+t}$ define

$\dot{B} := \{(a, b, c) \mid (a, c) \in B \text{ and } b \in \{0, 1\}^s\}$,

where a and c range over $\{0, 1\}^n$ and $\{0, 1\}^t$, respectively.

Theorem 23 ([Kra97]). Let Ay, ... , A k C {0, \} n+s and B x , . . . ,B t C {0, l} n+t . LetDx,...,D m 
be a semantic refutation from A±, . . . ,A^ and By, . . . , Bg. Assume that CC(Di) < for all i £ [m]. 
Then, the sets Ua and Vb (as defined above) can be separated by a Boolean circuit of size (m + 
n)2°«). 

In light of Theorem 23, to demonstrate that a certain propositional refutation system $\mathcal{P}$ possesses
a polynomial upper bound on interpolant circuits (see Definition 7.3) it suffices to show that any
proof-line of $\mathcal{P}$ induces a set of assignments with at most a logarithmic (in the number of variables)
communication complexity (Definition 7.4).

7.2. Polynomial Upper Bounds on Interpolants for R$^0$(lin). Here we apply Theorem 23 to
show that R$^0$(lin) has polynomial upper bounds on its interpolant circuits. Again, in what follows
we sometimes identify a disjunction of linear equations with the set of its satisfying assignments.

Theorem 24. R$^0$(lin) has polynomial upper bounds on interpolant circuits (Definition 7.3).

According to the paragraph after Theorem 23, all we need in order to establish Theorem 24 is 
the following lemma: 

Lemma 25. Let $D$ be an R$^0$(lin)-line with $N$ variables and let $\widetilde{D}$ be the set of assignments that
satisfy $D$.$^{12}$ Then, $CC(\widetilde{D}) = O(\log N)$.

$^{11}$ In other words, $CC(A)$ is the minimal number $\zeta$ for which there exists a protocol such that for every input
($u, q^u$ to Player I and $v, r^v$ to Player II) and every task (from Tasks 1, 2 and 3), the players need to exchange at most
$\zeta$ bits in order to solve the task.

Proof: Let $N = n + s + t$ (and so $\widetilde{D} \subseteq \{0,1\}^{n+s+t}$). For the sake of convenience we shall assume
that the $N$ variables in $D$ are partitioned into three (pairwise disjoint) groups $\vec{p} := (p_1, \ldots, p_n)$,
$\vec{q} := (q_1, \ldots, q_s)$ and $\vec{r} := (r_1, \ldots, r_t)$. Let $u, v \in \{0,1\}^n$, $q^u \in \{0,1\}^s$, $r^v \in \{0,1\}^t$. Assume that
Player I knows $u, q^u$ and Player II knows $v, r^v$.

By the definition of an R$^0$(lin)-line (see Definition 3.2) we can partition the disjunction $D$ into
a constant number of disjuncts, where one disjunct is a (possibly empty, translation of a) clause in
the $\vec{p}, \vec{q}, \vec{r}$ variables (see Section 3.1), and all other disjuncts have the following form:

$$\bigvee_{i \in I} \left(\vec{a} \cdot \vec{p} + \vec{b} \cdot \vec{q} + \vec{c} \cdot \vec{r} = \ell_i\right), \tag{40}$$

where $I$ is an (unbounded) set of indices, the $\ell_i$ are integers, for all $i \in I$, and $\vec{a}, \vec{b}, \vec{c}$ denote
vectors of $n$, $s$ and $t$ constant coefficients, respectively.

Let us denote the (translation of the) clause from $D$ in the $\vec{p}, \vec{q}, \vec{r}$ variables by

$$P \vee Q \vee R,$$

where $P$, $Q$ and $R$ denote the (translated) sub-clauses consisting of the $\vec{p}$, $\vec{q}$ and $\vec{r}$ variables,
respectively.

We need to show that by exchanging $O(\log N)$ bits, the players can solve each of Tasks 1, 2 and
3 from Definition 7.4 correctly.

Task 1: The players need to decide whether $(u, q^u, r^v) \in \widetilde{D}$. Player II, who knows $r^v$, computes
the numbers $\vec{c} \cdot r^v$, for every $\vec{c}$ pertaining to every disjunct of the form shown in Equation (40)
above. Then, Player II sends the (binary representation of) these numbers to Player I. Since there
are only constantly many such numbers and the coefficients in every $\vec{c}$ are also constants, this
amounts to $O(\log t) \le O(\log N)$ bits that Player II sends to Player I. Player II also computes the
truth value of the sub-clause $R$, and sends this (single-bit) value to Player I.

Now, it is easy to see that Player I has sufficient data to compute by herself/himself whether
$(u, q^u, r^v) \in \widetilde{D}$ (Player I can then send a single bit informing Player II whether $(u, q^u, r^v) \in \widetilde{D}$).

Task 2: This is analogous to Task 1.

Task 3: Assume that $(u, q^u, r^v) \in \widetilde{D}$ and $(v, q^u, r^v) \notin \widetilde{D}$ (the case $(u, q^u, r^v) \notin \widetilde{D}$ and
$(v, q^u, r^v) \in \widetilde{D}$ is analogous).

The first rounds of the protocol are identical to those described in Task 1 above: Player
II, who knows $r^v$, computes the numbers $\vec{c} \cdot r^v$, for every $\vec{c}$ pertaining to every disjunct of the form
shown in Equation (40) above. Then, Player II sends the (binary representation of) these numbers
to Player I. Player II also computes the truth value of the sub-clause $R$, and sends this (single-bit)
value to Player I. Again, this amounts to $O(\log N)$ bits that Player II sends to Player I.

By the assumption (that $(u, q^u, r^v) \in \widetilde{D}$ and $(v, q^u, r^v) \notin \widetilde{D}$) the players need to deal only with the
following two cases:

Case 1: The assignment $(u, q^u, r^v)$ satisfies the clause $P \vee Q \vee R$ while $(v, q^u, r^v)$ falsifies $P \vee Q \vee R$.
Thus, it must be that $u$ satisfies the sub-clause $P$ while $v$ falsifies $P$. This means that for any $i \in [n]$
such that $u_i$ sets to 1 a literal in $P$ (there ought to exist at least one such $i$), it must be that $u_i \neq v_i$.
Therefore, all that Player I needs to do is to send the (binary representation of the) index $i$ to Player
II. (This amounts to $O(\log N)$ bits that Player I sends to Player II.)

$^{12}$ The notation $\widetilde{D}$ has nothing to do with the same notation used in Section 3.

Case 2: There is some linear equation

$$\vec{a} \cdot \vec{p} + \vec{b} \cdot \vec{q} + \vec{c} \cdot \vec{r} = \ell \tag{41}$$

in $D$, such that $\vec{a} \cdot u + \vec{b} \cdot q^u + \vec{c} \cdot r^v = \ell$. Note that (by the assumption that $(v, q^u, r^v) \notin \widetilde{D}$) it must
also hold that $\vec{a} \cdot v + \vec{b} \cdot q^u + \vec{c} \cdot r^v \neq \ell$ (and so there is an $i \in [n]$ such that $u_i \neq v_i$). Player I can
find linear equation (41), as he/she already received from Player II all the possible values of $\vec{c} \cdot r^v$
(for all possible $\vec{c}$'s in $D$).

Recall that the left hand side of a linear equation $\vec{d} \cdot \vec{x} = \ell$ is called the linear form of the
equation. By the definition of an R$^0$(lin)-line there are only constantly many distinct linear forms
in $D$. Since both players know these linear forms, we can assume that each linear form has some
index associated to it by both players. Player I sends to Player II the index of the linear form
$\vec{a} \cdot \vec{p} + \vec{b} \cdot \vec{q} + \vec{c} \cdot \vec{r}$ from (41) in $D$. Since there are only constantly many such linear forms in $D$, it
takes only a constant number of bits to send this index.

Now both players need to apply a protocol for finding an $i \in [n]$ such that $u_i \neq v_i$, where
$\vec{a} \cdot u + \vec{b} \cdot q^u + \vec{c} \cdot r^v = \ell$ and $\vec{a} \cdot v + \vec{b} \cdot q^u + \vec{c} \cdot r^v \neq \ell$. Thus, it remains only to prove the following
claim:

Claim 4. There is a communication protocol in which Player I and Player II need at most $O(\log N)$
bits of communication in order to find an $i \in [n]$ such that $u_i \neq v_i$ (under the above conditions).

Proof of claim: We invoke the well-known connection between Boolean circuit depth and
communication complexity. Let $f : \{0,1\}^N \to \{0,1\}$ be a Boolean function. Denote by $dp(f)$ the
minimal depth of a Boolean circuit computing $f$. Consider a game between two players: Player I
knows some $x \in \{0,1\}^N$ and Player II knows some other $y \in \{0,1\}^N$, such that $f(x) = 1$ while
$f(y) = 0$. The goal of the game is to find an $i \in [N]$ such that $x_i \neq y_i$. Denote by $CC'(f)$ the
minimal number of bits needed for the two players to communicate (in the worst case$^{13}$) in order
to solve this game.$^{14}$ Then, for any function $f$ it is known that $dp(f) = CC'(f)$ (see [KW88]).

Therefore, to conclude the proof of the claim it is enough to establish that the function
$f : \{0,1\}^N \to \{0,1\}$ that receives the input variables $\vec{p}, \vec{q}, \vec{r}$ and computes the truth value of
$\vec{a} \cdot \vec{p} + \vec{b} \cdot \vec{q} + \vec{c} \cdot \vec{r} = \ell$ has a Boolean circuit of depth $O(\log N)$. In case all the coefficients in
$\vec{a}, \vec{b}, \vec{c}$ are 1, it is easy to show$^{15}$ that there is a Boolean circuit of depth $O(\log N)$ that computes
the function $f$. In the case that the coefficients in $\vec{a}, \vec{b}, \vec{c}$ are all constants, it is easy to show, by a
reduction to the case where all coefficients are 1,$^{16}$ that there is a Boolean circuit of depth $O(\log N)$
that computes the function $f$. We omit the details. ■ ■
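The message flow in the proof of Lemma 25 can be made concrete with a small simulation. The following Python sketch is illustrative only: the data layout (`Literal`, `Disjunct`) and all function names are assumptions introduced here, not notation from the paper. The first two functions mirror Task 1 (Player II sends the values $\vec{c} \cdot r^v$ and the truth value of $R$; Player I then decides membership). The last function is a plain binary search on partial sums for locating an index with $u_i \neq v_i$; it is a deliberately simpler substitute for Claim 4, using $O(\log N)$ rounds of $O(\log N)$ bits each, hence $O(\log^2 N)$ bits in total rather than the $O(\log N)$ obtained via the Karchmer-Wigderson connection.

```python
from typing import List, Set, Tuple

Literal = Tuple[int, bool]  # (variable index, True for a positive literal) -- hypothetical encoding
Disjunct = Tuple[List[int], List[int], List[int], Set[int]]  # (a, b, c, set of the l_i)

def dot(coeffs, bits):
    """Integer inner product of a coefficient vector with a 0/1 vector."""
    return sum(c * x for c, x in zip(coeffs, bits))

def clause_true(literals: List[Literal], bits) -> bool:
    """Truth value of a sub-clause under a 0/1 assignment."""
    return any(bits[i] == (1 if positive else 0) for i, positive in literals)

def player2_message(r_v, R: List[Literal], disjuncts: List[Disjunct]):
    """Player II's message: one value c . r_v per disjunct, plus the truth value of R."""
    return [dot(c, r_v) for (_, _, c, _) in disjuncts], clause_true(R, r_v)

def player1_decides(u, q_u, P, Q, disjuncts: List[Disjunct], message) -> bool:
    """Player I combines the message with u and q_u to decide membership in the line."""
    c_values, r_true = message
    if r_true or clause_true(P, u) or clause_true(Q, q_u):
        return True
    return any(dot(a, u) + dot(b, q_u) + cv in targets
               for (a, b, _, targets), cv in zip(disjuncts, c_values))

def find_differing_index(a, u, v):
    """Binary search for i with u[i] != v[i], assuming dot(a, u) != dot(a, v).

    Invariant: the partial sums of a_i*u_i and a_i*v_i over [lo, hi) differ.
    Returns the index and the number of rounds (one partial sum sent per round).
    """
    lo, hi, rounds = 0, len(a), 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left_u = dot(a[lo:mid], u[lo:mid])  # sent by Player I
        left_v = dot(a[lo:mid], v[lo:mid])  # computed locally by Player II
        rounds += 1
        if left_u != left_v:
            hi = mid
        else:
            lo = mid  # left halves agree, so the right halves must differ
    return lo, rounds
```

In this sketch the number of values sent by Player II equals the number of disjuncts, which is constant for an R$^0$(lin)-line, and each value is bounded in absolute value by a constant times $t$, matching the $O(\log N)$ count in the proof.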



$^{13}$ Over all inputs $x, y$ such that $f(x) = 1$ and $f(y) = 0$.
$^{14}$ The measure $CC'$ is basically the same as $CC$ defined earlier.
$^{15}$ Using the known $O(\log N)$-depth Boolean circuits for the threshold functions.
$^{16}$ For instance, consider the simple case where we have only a single variable. That is, let $c$ be a constant and
assume that we wish to construct a circuit that computes $c \cdot x = \ell$, for some integer $\ell$. Then, we take a circuit that
computes the function $f : \{0,1\}^c \to \{0,1\}$ that outputs the truth value of $y_1 + \ldots + y_c = \ell$ (thus, in $f$ all coefficients
are 1's); and to compute $c \cdot x = \ell$ we only have to substitute each $y_i$ in the circuit with the variable $x$.

8. Size Lower Bounds

In this section we establish an exponential-size lower bound on R$^0$(lin) refutations of the
clique-coloring formulas. We shall employ the theorem of Bonet, Pitassi & Raz in [BPR97] that provides
exponential-size lower bounds for any semantic refutation of the clique-coloring formulas having
low communication complexity in each refutation-line.

First we recall the strong lower bound obtained by Alon & Boppana [AB87] (improving over
[Razb85]; see also [And85]) for the (monotone) clique separator functions, defined as follows (a
function $f : \{0,1\}^n \to \{0,1\}$ is called monotone if for all $\alpha \in \{0,1\}^n$, $\alpha' \ge \alpha$ implies $f(\alpha') \ge f(\alpha)$):

Definition 8.1 (Clique separator). A monotone Boolean function $Q_{k,k'}$ is called a clique separator
if it interprets its inputs as the edges of a graph on $n$ vertices, and outputs 1 on every input
representing a $k$-clique, and 0 on every input representing a complete $k'$-partite graph (see Section
6.3).

Recall that a monotone Boolean circuit is a circuit that uses only monotone Boolean gates (for
instance, only the fan-in two gates $\wedge$, $\vee$).

Theorem 26 ([AB87]). Let $k, k'$ be integers such that $3 \le k' < k$ and $k\sqrt{k'} \le n/(8 \log n)$. Then
every monotone Boolean circuit that computes a clique separator function $Q_{k,k'}$ requires size at least

$$\frac{1}{8} \left( \frac{n}{4 k \sqrt{k'} \log n} \right)^{(\sqrt{k'}+1)/2}.$$

For the next theorem, we need a slightly different (and weaker) version of communication com- 
plexity, than that in Definition 7.4. 

Definition 8.2 (Communication complexity (second definition)). Let $X$ denote $n$ Boolean variables
$x_1, \ldots, x_n$, and let $S_1, S_2$ be a partition of $X$ into two disjoint sets of variables. The communication
complexity of a Boolean function $f : \{0,1\}^n \to \{0,1\}$ is the number of bits needed to be exchanged
by two players, one knowing the values given to the $S_1$ variables and the other knowing the values
given to the $S_2$ variables, in the worst case, over all possible partitions $S_1$ and $S_2$.

Theorem 27 ([BPR97]). Every semantic refutation of $\neg\mathrm{CLIQUE}^n_{k,k'}$ (for $k' < k$) with $m$
refutation-lines and where each refutation-line (considered as the characteristic function of the line) has
communication complexity (as in Definition 8.2) $\zeta$, can be transformed into a monotone circuit of
size $m \cdot 2^{3\zeta+1}$ that computes a separating function $Q_{k,k'}$.

In light of Theorem 26, in order to be able to apply Theorem 27 to R$^0$(lin), and arrive at an
exponential-size lower bound for R$^0$(lin) refutations of the clique-coloring formulas, it suffices to
show that R$^0$(lin) proof-lines have logarithmic communication complexity:

Lemma 28. Let $D$ be an R$^0$(lin)-line with $N$ variables. Then, the communication complexity (as
in Definition 8.2) of $D$ is at most $O(\log N)$ (where $D$ is identified here with the characteristic
function of $D$).

Proof: The proof is similar to the proof of Lemma 25 for solving Task 1 (and the analogous Task
2) in Definition 7.4. ■

By direct calculations we obtain the following lower bound from Theorems 26, 27 and Lemma
28:

Corollary 29. Let $k$ be an integer such that $3 \le k' = k - 1$ and assume that
$\frac{1}{2} \cdot n/(8 \log n) \le k\sqrt{k} \le n/(8 \log n)$. Then, for all $\varepsilon < 1/3$, every R$^0$(lin) refutation of
$\neg\mathrm{CLIQUE}^n_{k,k'}$ is of size at least $2^{\Omega(n^{\varepsilon})}$.

When considering the parameters of Theorem 20, we obtain a super-polynomial separation
between R$^0$(lin) refutations and R(lin) refutations, as described below.
From Theorems 26, 27 and Lemma 28 we have (by direct calculations):

Corollary 30. Let $k = \sqrt{n}$ and $k' = (\log n)^2 / (8 \log\log n)$. Then, every R$^0$(lin) refutation of
$\neg\mathrm{CLIQUE}^n_{k,k'}$ has size at least $n^{\Omega\left(\frac{\log n}{\sqrt{\log\log n}}\right)}$.

By Corollary 21, R(lin) admits polynomial-size (in $n$) refutations of $\neg\mathrm{CLIQUE}^n_{k,k'}$ under the
parameters in Corollary 30. Thus we obtain the following separation result:

Corollary 31. R(lin) is super-polynomially stronger than R$^0$(lin).

Comment 1. Note that we do not need to assume that the coefficients in R$^0$(lin)-lines are constants
for the lower bound argument. If the coefficients in R$^0$(lin)-lines are only polynomially bounded
(in the number of variables) then the same lower bound as in Corollary 30 also applies. This is
because R$^0$(lin)-lines in which coefficients are polynomially bounded integers still have low (that
is, logarithmic) communication complexity (as in Definition 8.2).



9. Applications to Multilinear Proofs

In this section we arrive at one of the main benefits of the work we have done so far; namely,
applying results on resolution over linear equations in order to obtain new results for multilinear
proof systems. Subsection 9.1, which follows, contains definitions, sufficient for the current paper,
concerning the notion of multilinear proofs introduced in [RT06].

9.1. Background on Algebraic and Multilinear Proofs.

9.1.1. Arithmetic and Multilinear Formulas.

Definition 9.1 (Arithmetic formula). Fix a field $\mathbb{F}$. An arithmetic formula is a tree, with edges
directed from the leaves to the root, and with unbounded (finite) fan-in. Every leaf of the tree
(namely, a node of fan-in 0) is labeled with either an input variable or a field element. A field
element can also label an edge of the tree. Every other node of the tree is labeled with either $+$ or
$\times$ (in the first case the node is a plus gate and in the second case a product gate). We assume that
there is only one node of out-degree zero, called the root. The size of an arithmetic formula $F$ is
the total number of nodes in its graph and is denoted by $|F|$. An arithmetic formula computes a
polynomial in the ring of polynomials $\mathbb{F}[x_1, \ldots, x_n]$ in the following way. A leaf just computes the
input variable or field element that labels it. A field element that labels an edge means that the
polynomial computed at its tail (namely, the node where the edge is directed from) is multiplied
by this field element. A plus gate computes the sum of the polynomials computed by the tails of all
incoming edges. A product gate computes the product of the polynomials computed by the tails of
all incoming edges. (Subtraction is obtained using the constant $-1$.) The output of the formula is
the polynomial computed by the root. The depth of a formula $F$ is the maximal number of edges
in a path from a leaf to the root of $F$.

We say that an arithmetic formula has a plus (resp., product) gate at the root if the root of the
formula is labeled with a plus (resp., product) gate.

A polynomial is multilinear if in each of its monomials the power of every input variable is at
most one.

Definition 9.2 (Multilinear formula). An arithmetic formula is a multilinear formula (or
equivalently, multilinear arithmetic formula) if the polynomial computed by each gate of the formula is
multilinear (as a formal polynomial, that is, as an element of $\mathbb{F}[x_1, \ldots, x_n]$).

An additional definition we shall need is the following linear operator, called the multilinearization
operator:

Definition 9.3 (Multilinearization operator). Given a field $\mathbb{F}$ and a polynomial $q \in \mathbb{F}[x_1, \ldots, x_n]$,
we denote by $M[q]$ the unique multilinear polynomial equal to $q$ modulo the ideal generated by all
the polynomials $x_i^2 - x_i$, for all variables $x_i$.

For example, if $q = x_1^2 x_2 + \alpha x_4^3$ (for some $\alpha \in \mathbb{F}$) then $M[q] = x_1 x_2 + \alpha x_4$.
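Computationally, the multilinearization operator simply caps every exponent at 1 and merges the monomials that then coincide. The following Python sketch illustrates this on a sparse representation; the representation (a dictionary from exponent tuples to coefficients) and the function name `multilinearize` are assumptions made here for illustration, and rational numbers stand in for a field of characteristic 0.

```python
from collections import defaultdict
from fractions import Fraction

def multilinearize(poly):
    """Apply M[.] to a polynomial given as {exponent tuple: coefficient}.

    Reducing modulo the polynomials x_i^2 - x_i replaces every positive
    power of x_i by x_i itself, so each exponent is capped at 1 and the
    coefficients of monomials that become equal are added up.
    """
    result = defaultdict(Fraction)  # Fraction() is the zero coefficient
    for exponents, coeff in poly.items():
        reduced = tuple(min(e, 1) for e in exponents)
        result[reduced] += coeff
    # Drop monomials whose coefficients cancelled out.
    return {m: c for m, c in result.items() if c != 0}

# q = x1^2 * x2 + alpha * x4^3 over variables (x1, x2, x3, x4); alpha = 5 is an arbitrary choice.
alpha = Fraction(5)
q = {(2, 1, 0, 0): Fraction(1), (0, 0, 0, 3): alpha}
m_q = multilinearize(q)  # represents x1 * x2 + alpha * x4
```

Linearity of $M[\cdot]$ is visible in the sketch: each monomial is reduced independently and the results are summed.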

The simulation of R$^0$(lin) by multilinear proofs will rely heavily on the fact that multilinear
symmetric polynomials have small depth-3 multilinear formulas over fields of characteristic 0 (see
[SW01] for a proof of this fact). To this end we define precisely the concept of symmetric
polynomials.

A renaming of the variables $x_1, \ldots, x_n$ is a permutation $\sigma \in S_n$ (the symmetric group on $[n]$)
such that $x_i$ is mapped to $x_{\sigma(i)}$ for every $1 \le i \le n$.

Definition 9.4 (Symmetric polynomial). Given a set of variables $X = \{x_1, \ldots, x_n\}$, a symmetric
polynomial $f$ over $X$ is a polynomial in (all the variables of) $X$ such that renaming of variables
does not change the polynomial (as a formal polynomial).
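One standard way to see why multilinear symmetric polynomials admit small depth-3 multilinear formulas in characteristic 0 is interpolation (a construction usually attributed to Ben-Or; we do not claim it is the exact presentation used in [SW01]). The elementary symmetric polynomial $e_k(x_1, \ldots, x_n)$ is the coefficient of $t^k$ in $\prod_i (1 + t x_i)$, so it equals $\sum_{j=0}^{n} c_{k,j} \prod_i (1 + t_j x_i)$ for any $n+1$ distinct field elements $t_0, \ldots, t_n$ and suitable constants $c_{k,j}$. This is a sum of $n+1$ products of linear forms, each product multilinear. The Python sketch below, with function names chosen here for illustration, checks the identity numerically over the rationals.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def lagrange_coefficients(points, j):
    """Coefficients (lowest degree first) of the j-th Lagrange basis polynomial."""
    coeffs = [Fraction(1)]
    for m, t_m in enumerate(points):
        if m == j:
            continue
        denom = Fraction(points[j] - t_m)
        # Multiply the current polynomial by (t - t_m) / denom.
        new = [Fraction(0)] * (len(coeffs) + 1)
        for d, c in enumerate(coeffs):
            new[d] += c * (-t_m) / denom
            new[d + 1] += c / denom
        coeffs = new
    return coeffs

def elementary_symmetric_depth3(x, k):
    """Evaluate e_k(x) as a sum of n+1 products of linear forms (1 + t_j * x_i)."""
    n = len(x)
    points = list(range(n + 1))  # any n+1 distinct field elements would do
    total = Fraction(0)
    for j, t_j in enumerate(points):
        c_kj = lagrange_coefficients(points, j)[k]  # coefficient of t^k
        total += c_kj * prod(1 + t_j * x_i for x_i in x)
    return total

def elementary_symmetric_direct(x, k):
    """Reference evaluation: sum over all k-subsets of the product of their entries."""
    return sum(prod(s) for s in combinations(x, k))
```

Since every multilinear symmetric polynomial over $x_1, \ldots, x_n$ is a linear combination of $e_0, \ldots, e_n$, summing such expressions keeps the depth at 3 with a plus gate at the root. The interpolation step divides by differences of the $t_j$, which is where having enough distinct field elements (as in characteristic 0) is used.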

9.1.2. Polynomial Calculus with Resolution. Here we define the PCR proof system, introduced by
Alekhnovich et al. in [ABSRW02].

Definition 9.5 (Polynomial Calculus with Resolution (PCR)). Let $\mathbb{F}$ be some fixed field and
let $Q := \{Q_1, \ldots, Q_m\}$ be a collection of multivariate polynomials from the ring of polynomials
$\mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$. The variables $\bar{x}_1, \ldots, \bar{x}_n$ are treated as new formal variables. Call the
set of polynomials $x^2 - x$, for $x \in \{x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n\}$, plus the polynomials $x_i + \bar{x}_i - 1$, for
all $1 \le i \le n$, the set of Boolean axioms of PCR. A PCR proof from $Q$ of a polynomial $g$ is
a finite sequence $\pi = (p_1, \ldots, p_\ell)$ of multivariate polynomials from $\mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$ (each
polynomial $p_i$ is interpreted as the polynomial equation $p_i = 0$), where $p_\ell = g$ and for each $i \in [\ell]$,
either $p_i = Q_j$ for some $j \in [m]$, or $p_i$ is a Boolean axiom, or $p_i$ was deduced from $p_j, p_k$, where
$j, k < i$, by one of the following inference rules:

Product: From $p$ deduce $x_i \cdot p$, for some variable $x_i$;

From $p$ deduce $\bar{x}_i \cdot p$, for some variable $\bar{x}_i$;

Addition: From $p$ and $q$ deduce $\alpha \cdot p + \beta \cdot q$, for some $\alpha, \beta \in \mathbb{F}$.

A PCR refutation of $Q$ is a proof of 1 (which is interpreted as $1 = 0$) from $Q$. The number of
steps in a PCR proof is the number of proof-lines in it (that is, $\ell$ in the case of $\pi$ above).

Note that the Boolean axioms of PCR have only 0, 1 solutions, where $\bar{x}_i = 0$ if $x_i = 1$ and $\bar{x}_i = 1$
if $x_i = 0$.

9.1.3. Multilinear Proof Systems. In [RT06] the authors introduced a natural (semantic) algebraic
proof system that operates with multilinear arithmetic formulas, denoted fMC (which stands for
formula multilinear calculus), defined as follows:

Definition 9.6 (Formula Multilinear Calculus (fMC)). Fix a field $\mathbb{F}$ and let $Q := \{Q_1, \ldots, Q_m\}$ be
a collection of multilinear polynomials from $\mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$ (the variables $\bar{x}_1, \ldots, \bar{x}_n$ are
treated as formal variables). Call the set of polynomials consisting of $x_i + \bar{x}_i - 1$ and $x_i \cdot \bar{x}_i$ for
$1 \le i \le n$ the Boolean axioms of fMC. An fMC proof from $Q$ of a polynomial $g$ is a finite sequence
$\pi = (p_1, \ldots, p_\ell)$ of multilinear polynomials from $\mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$, such that $p_\ell = g$ and for
each $i \in [\ell]$, either $p_i = Q_j$ for some $j \in [m]$, or $p_i$ is a Boolean axiom of fMC, or $p_i$ was deduced
by one of the following inference rules using $p_j, p_k$ for $j, k < i$:

Product: from $p$ deduce $q \cdot p$, for some polynomial $q \in \mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$ such that
$p \cdot q$ is multilinear;

Addition: from $p, q$ deduce $\alpha \cdot p + \beta \cdot q$, for some $\alpha, \beta \in \mathbb{F}$.

All the polynomials in an fMC proof are represented as multilinear formulas. (A polynomial $p_i$ in
an fMC proof is interpreted as the polynomial equation $p_i = 0$.) An fMC refutation of $Q$ is a proof
of 1 (which is interpreted as $1 = 0$) from $Q$. The size of an fMC proof $\pi$ is defined as the total
sum of all the formula sizes in $\pi$ and is denoted by $|\pi|$.

Note that the Boolean axioms have only 0, 1 solutions, where $\bar{x}_i = 0$ if $x_i = 1$ and $\bar{x}_i = 1$ if
$x_i = 0$, for each $1 \le i \le n$.

Definition 9.7 (Depth-$k$ Formula Multilinear Calculus (depth-$k$ fMC)). For a natural number $k$,
depth-$k$ fMC denotes a restriction of the fMC proof system, in which proofs consist of multilinear
polynomials from $\mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$ represented as multilinear formulas of depth at most $k$.

9.2. From R(lin) Proofs to PCR Proofs. We now demonstrate a general and straightforward
translation from R(lin) proofs into PCR proofs over fields of characteristic 0. We use the term
"translation" in order to distinguish it from a simulation, since here we are not interested in the
size of PCR proofs. In fact we have not defined the size of PCR proofs at all. We shall be interested
only in the number of steps in PCR proofs.

From now on, all polynomials and arithmetic formulas are considered over some fixed field $\mathbb{F}$ of
characteristic 0. Recall that any field of characteristic 0 contains (an isomorphic copy of) the
integer numbers, and so we can use integer coefficients in the field.

Definition 9.8 (Polynomial translation of R(lin) proof-lines). Let $D$ be a disjunction of linear
equations:

$$\left(a^{(1)}_1 x_1 + \ldots + a^{(1)}_n x_n = a^{(1)}_0\right) \vee \cdots \vee \left(a^{(t)}_1 x_1 + \ldots + a^{(t)}_n x_n = a^{(t)}_0\right). \tag{42}$$

We denote by $\widehat{D}$ its translation into the following polynomial:$^{17}$

$$\left(a^{(1)}_1 x_1 + \ldots + a^{(1)}_n x_n - a^{(1)}_0\right) \cdots \left(a^{(t)}_1 x_1 + \ldots + a^{(t)}_n x_n - a^{(t)}_0\right). \tag{43}$$

If $D$ is the empty disjunction, we define $\widehat{D}$ to be the polynomial 1.

It is clear that every 0, 1 assignment to the variables in $D$ satisfies $D$ if and only if $\widehat{D}$ evaluates
to 0 under the assignment.
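The equivalence just stated is easy to check exhaustively on small instances. In the following Python sketch a disjunction of linear equations is represented as a list of pairs (coefficient vector, right-hand side); this representation and the function names are assumptions made here for illustration only.

```python
from itertools import product
from math import prod

def linear_form(a, x):
    """Value of a_1*x_1 + ... + a_n*x_n."""
    return sum(a_i * x_i for a_i, x_i in zip(a, x))

def satisfies(disjunction, x):
    """True iff the assignment x satisfies at least one equation a . x = a0."""
    return any(linear_form(a, x) == a0 for a, a0 in disjunction)

def translation_value(disjunction, x):
    """Value of the polynomial translation: the product of the (a . x - a0).

    The empty product is 1, matching the convention for the empty disjunction.
    """
    return prod(linear_form(a, x) - a0 for a, a0 in disjunction)

# Example line: (x1 + x2 = 1) or (x2 + x3 = 2), over three variables.
D = [([1, 1, 0], 1), ([0, 1, 1], 2)]
agree = all(satisfies(D, x) == (translation_value(D, x) == 0)
            for x in product([0, 1], repeat=3))
```

Here `agree` records whether satisfaction of the line coincides with the vanishing of its translation on every 0, 1 assignment. The check relies only on the fact that a product of integers is zero exactly when one factor is zero, which is why the translation is carried out over a field of characteristic 0.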

Proposition 3. Let $\pi = (D_1, \ldots, D_\ell)$ be an R(lin) proof sequence of $D_\ell$, from some collection
of initial disjunctions of linear equations $Q_1, \ldots, Q_m$. Then, there exists a PCR proof of $\widehat{D_\ell}$ from
$\widehat{Q_1}, \ldots, \widehat{Q_m}$ with at most a polynomial in $|\pi|$ number of steps.

Proof: We proceed by induction on the number of lines in $\pi$.

The base case is the translation of the axioms of R(lin) via the translation scheme in Definition
9.8. An R(lin) Boolean axiom $(x_i = 0) \vee (x_i = 1)$ is translated into $x_i \cdot (x_i - 1)$, which is already a
Boolean axiom of PCR.

For the induction step, we translate every R(lin) inference rule application into a polynomial-size
PCR proof sequence as follows. We use the following simple claim:

Claim 5. Let $p$ and $q$ be two polynomials and let $s$ be the minimal size of an arithmetic formula
computing $q$. Then one can derive in PCR, with only a polynomial in $s$ number of steps, from $p$
the product $q \cdot p$.$^{18}$

Proof of claim: By induction on $s$. ■

$^{17}$ This notation should not be confused with the same notation in Section 6.3.
$^{18}$ Again, note that we only require that the number of steps in the proof is polynomial. We do not consider here
the size of the PCR proof.

Assume that $D_i = D_j \vee L$ was derived from $D_j$ using the Weakening inference rule of R(lin),
where $j < i \le \ell$ and $L$ is some linear equation. Then, by Claim 5, $\widehat{D_i} = \widehat{D_j} \cdot \widehat{L}$ can be derived from
$\widehat{D_j}$ with a derivation of at most polynomial in $|D_j \vee L|$ many steps.

Assume that $D_i$ was derived from $D_j$, where $D_j$ is $D_i \vee (0 = k)$, using the Simplification inference
rule of R(lin), where $j < i \le \ell$ and $k$ is a non-zero integer. Then, $\widehat{D_i}$ can be derived from
$\widehat{D_j} = \widehat{D_i} \cdot (-k)$ by multiplying with $-k^{-1}$ (via the Addition rule of PCR).

Thus, it remains to simulate the resolution rule application of R(lin). Let $A, B$ be two
disjunctions of linear equations and assume that $A \vee B \vee ((\vec{a} + \vec{b}) \cdot \vec{x} = a_0 + b_0)$ was derived in $\pi$ from
$A \vee (\vec{a} \cdot \vec{x} = a_0)$ and $B \vee (\vec{b} \cdot \vec{x} = b_0)$ (the case where $A \vee B \vee ((\vec{a} - \vec{b}) \cdot \vec{x} = a_0 - b_0)$ was derived from
$A \vee (\vec{a} \cdot \vec{x} = a_0)$ and $B \vee (\vec{b} \cdot \vec{x} = b_0)$ is similar).

We need to derive $\widehat{A} \cdot \widehat{B} \cdot ((\vec{a} + \vec{b}) \cdot \vec{x} - a_0 - b_0)$ from $\widehat{A} \cdot (\vec{a} \cdot \vec{x} - a_0)$ and $\widehat{B} \cdot (\vec{b} \cdot \vec{x} - b_0)$. This is
done by multiplying $\widehat{A} \cdot (\vec{a} \cdot \vec{x} - a_0)$ with $\widehat{B}$ and multiplying $\widehat{B} \cdot (\vec{b} \cdot \vec{x} - b_0)$ with $\widehat{A}$ (using Claim 5),
and then adding the resulting polynomials together. ■
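The final step of the proof rests on a one-line polynomial identity, spelled out below. We write $\widehat{A}$ for the polynomial translation of a disjunction $A$ as in Definition 9.8; the hat is a notational assumption made for this worked equation. The subtraction case is included for completeness and uses the Addition rule with coefficients $1$ and $-1$.

```latex
\begin{align*}
\widehat{B}\cdot\Bigl[\widehat{A}\,(\vec{a}\cdot\vec{x}-a_0)\Bigr]
 + \widehat{A}\cdot\Bigl[\widehat{B}\,(\vec{b}\cdot\vec{x}-b_0)\Bigr]
 &= \widehat{A}\,\widehat{B}\,\bigl((\vec{a}+\vec{b})\cdot\vec{x}-a_0-b_0\bigr),\\[4pt]
\widehat{B}\cdot\Bigl[\widehat{A}\,(\vec{a}\cdot\vec{x}-a_0)\Bigr]
 - \widehat{A}\cdot\Bigl[\widehat{B}\,(\vec{b}\cdot\vec{x}-b_0)\Bigr]
 &= \widehat{A}\,\widehat{B}\,\bigl((\vec{a}-\vec{b})\cdot\vec{x}-(a_0-b_0)\bigr).
\end{align*}
```

Each bracketed polynomial is a translated premise, each multiplication by $\widehat{A}$ or $\widehat{B}$ is an application of Claim 5, and the right-hand sides are exactly the translations of the two possible resolvents.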

Remark 4. When translating R(lin) proofs into PCR proofs we actually do not make any use of
the "negative" variables $\bar{x}_1, \ldots, \bar{x}_n$. Nevertheless, the multilinear proof systems make use of these
variables in order to polynomially simulate PCR proofs (see Theorem 33 and its proof in [RT06]).

We shall need the following corollary in the sequel:

Corollary 32. Let $\pi = D_1, \ldots, D_\ell$ be an R$^0$(lin) proof of $D_\ell$, and let $s$ be the maximal size of an
R$^0$(lin)-line in $\pi$. Then there is a PCR proof $\pi'$ of $\widehat{D_\ell}$ with a polynomial in $|\pi|$ number of steps
and such that every line of $\pi'$ is a translation (via Definition 9.8) of an R$^0$(lin)-line (Definition
3.2), where the size of the R$^0$(lin)-line is polynomial in $s$.

Proof: The simulation of R(lin) by PCR shown above can be thought of as, first, considering
$\widehat{D_1}, \ldots, \widehat{D_\ell}$ as the "skeleton" of a PCR proof of $\widehat{D_\ell}$; and second, for each $D_i$ that was deduced by
one of R(lin)'s inference rules from previous lines, inserting the corresponding PCR proof sequence
that simulates the appropriate inference rule application (as described in the proof of Proposition
3). By definition, those PCR proof-lines that correspond to lines in the skeleton $\widehat{D_1}, \ldots, \widehat{D_\ell}$ are
translations of R$^0$(lin)-lines (with size at most polynomial in $s$). Thus, to conclude the proof of
the corollary, one needs only to check that for any R$^0$(lin)-line $D_i$ that was deduced by one of
R(lin)'s inference rules from previous R$^0$(lin)-lines (as demonstrated in the proof of Proposition 3),
the inserted corresponding PCR proof sequence uses only translations of R$^0$(lin)-lines (with size
polynomial in $s$). This can be verified by a straightforward inspection. ■

9.3. From PCR Proofs to Multilinear Proofs. We now recall the general simulation result
proved in [RT06] stating the following: Let $\pi$ be a PCR refutation of some initial collection of
multilinear polynomials $Q$ over some fixed field. Assume that $\pi$ has polynomially many steps (that
is, the number of proof lines in the PCR proof sequence is polynomial). If the 'multilinearization'
(namely, the result of applying the $M[\cdot]$ operator, see Definition 9.3) of each of the polynomials
in $\pi$ has a polynomial-size depth-$d$ multilinear formula (with a plus gate at the root), then there is
a polynomial-size depth-$d$ fMC refutation of $Q$. More formally, we have:

Theorem 33 ([RT06]). Fix a field $\mathbb{F}$ (not necessarily of characteristic 0) and let $Q$ be a set of
multilinear polynomials from $\mathbb{F}[x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n]$. Let $\pi = (p_1, \ldots, p_m)$ be a PCR refutation
of $Q$. For each $p_i \in \pi$, let $\Phi_i$ be a multilinear formula for the polynomial $M[p_i]$. Let $s$ be the total
size of all formulas $\Phi_i$, that is, $s = \sum_{i=1}^{m} |\Phi_i|$, and let $d \ge 2$ be the maximal depth of all formulas
$\Phi_i$. Assume that the depth of all the formulas $\Phi_i$ that have a product gate at the root is at most
$d - 1$. Then there is a depth-$d$ fMC refutation of $Q$ of size polynomial in $s$.

9.3.1. Depth-3 Multilinear Proofs. Here we show that multilinear proofs operating with depth-3
multilinear formulas (that is, depth-3 fMC) over fields of characteristic 0 polynomially simulate
R$^0$(lin) proofs. In light of Corollary 32 and Theorem 33, to this end it suffices to show that any
R$^0$(lin)-line $D$ translates into a corresponding polynomial $p$ (via the translation in Definition 9.8)
such that $M[p]$ has a multilinear formula of size polynomial (in the number of variables) and depth
at most 3 (with a plus gate at the root) over fields of characteristic 0.
We need the following proposition from [RT06]:

Proposition 4 ([RT06]). Let $\mathbb{F}$ be a field of characteristic 0. For a constant $c$, let $X_1, \ldots, X_c$
be $c$ finite sets of variables (not necessarily disjoint), where $\sum_{i=1}^{c} |X_i| = n$. Let $f_1, \ldots, f_c$ be $c$
symmetric polynomials over $X_1, \ldots, X_c$ (over the field $\mathbb{F}$), respectively. Then, there is a depth-3
multilinear formula for $M[f_1 \cdots f_c]$ of size polynomial (in $n$), with a plus gate at the root.

The following is the key lemma of the simulation:

Lemma 34. Let $D$ be an R$^0$(lin)-line with $n$ variables and let $p = \widehat{D}$ (see Definition 9.8). Then,
$M[p]$ has a depth-3 multilinear formula over fields of characteristic 0, with a plus gate at the root
and size at most polynomial in the size of $D$.

Proof: Assume that the underlying variables of $D$ are $\vec{x} = (x_1, \ldots, x_n)$. By the definition of an
R$^0$(lin)-line (see Definition 3.2) we can partition the disjunction $D$ into a constant number of
disjuncts, where one disjunct is a (possibly empty, translation of a) clause $C$,$^{19}$ and all other
disjuncts have the following form:

$$\bigvee_{i=1}^{m} \left(\vec{a} \cdot \vec{x} = \ell_i\right), \tag{44}$$

where the $\ell_i$'s are integers, $m$ is not necessarily bounded and $\vec{a}$ denotes a vector of $n$ constant
integer coefficients.

Let us denote by $q$ the polynomial representing the clause $C$.$^{20}$

Consider a disjunct as shown in (44). Since the coefficients $\vec{a}$ are constants, $\vec{a} \cdot \vec{x}$ can be written
as a sum of a constant number of linear forms, each with the same constant coefficient. In other
words, $\vec{a} \cdot \vec{x}$ can be written as $z_1 + \ldots + z_d$, for some constant $d$, where for all $i \in [d]$:

$$z_i := b \cdot \sum_{j \in J} x_j, \tag{45}$$

for some $J \subseteq [n]$ and some constant integer $b$. We shall assume without loss of generality that $d$ is
the same constant for every disjunct of the form (44) inside $D$ (otherwise, take $d$ to be the maximal
such $d$).

Thus, (44) is translated (via the translation scheme in Definition 9.8) into:

$$\prod_{i=1}^{m} \left(z_1 + \ldots + z_d - \ell_i\right). \tag{46}$$

By fully expanding the product in (46), we arrive at:

$$\sum_{r_1 + \ldots + r_{d+1} = m} \left( \alpha_{r_{d+1}} \cdot \prod_{k=1}^{d} z_k^{r_k} \right), \tag{47}$$

where the $r_i$'s are non-negative integers, and where the $\alpha_r$'s, for every $0 \le r \le m$, are just integer
coefficients, formally defined as follows (this definition is not essential; we present it only for the
sake of concreteness):

$$\alpha_r := \sum_{\substack{U \subseteq [m] \\ |U| = r}} \; \prod_{j \in U} (-\ell_j). \tag{48}$$

$^{19}$ If there is more than one clause in $D$, we simply combine all the clauses into a single clause.

$^{20}$ $C$ is a translation of a clause (that is, disjunction of literals) into a disjunction of linear equations, as defined
in Section 3.1. The polynomial $q$ is then the polynomial translation of this disjunction of linear equations, as in
Definition 9.8.
Claim 6. The polynomial $\widehat{D}$ (the polynomial translation of $D$) is a linear combination (over $\mathbb{F}$) of
polynomially (in $|D|$) many terms, such that each term can be written as

$$q \cdot \prod_{k \in K} z_k^{r_k},$$

where $K$ is a collection of a constant number of indices, the $r_k$'s are non-negative integers, and the $z_k$'s
and $q$ are as above (that is, the $z_k$'s are linear forms, where each $z_k$ has a single coefficient for all
variables in it, as in (45), and $q$ is a polynomial translation of a clause).

Proof of claim: Denote the total number of disjuncts of the form (44) in $D$ by $h$. By definition
(of an R$^0$(lin)-line), $h$ is a constant. Consider the polynomial (47) above. In $\widehat{D}$, we actually need to
multiply $h$ many polynomials of the form shown in (47) and the polynomial $q$.

For every $j \in [h]$ we write the (single) linear form in the $j$th disjunct as a sum of constantly
many linear forms $z_{j,1} + \ldots + z_{j,d}$, where each linear form $z_{j,k}$ has the same coefficient for every
variable in it. Thus, $\widehat{D}$ can be written as:

$$q \cdot \prod_{j=1}^{h} \left( \sum_{r_1 + \ldots + r_{d+1} = m_j} \underbrace{\alpha^{(j)}_{r_{d+1}} \cdot \prod_{k=1}^{d} z_{j,k}^{r_k}}_{(\star)} \right) \tag{49}$$

(where the $m_j$'s are not bounded, and the coefficients $\alpha^{(j)}_{r_{d+1}}$ are as defined in (48), except that here
we add the index $(j)$ to denote that they depend on the $j$th disjunct in $D$). Denote the maximal
$m_j$, for all $j \in [h]$, by $m_0$. The size of $D$, denoted $|D|$, is at least $m_0$. Note that since $d$ is a
constant, the number of summands in each (middle) sum in (49) is polynomial in $m_0$, which is at
most polynomial in $|D|$. Thus, by expanding the outermost product in (49), we arrive at a sum of
polynomially in $|D|$ many summands. Each summand in this sum is a product of $h$ terms of the
form $(\star)$ multiplied by $q$. ■

It remains to apply the multilinearization operator (Definition 9.3) on $D$, and verify that the
resulting polynomial has a depth-3 multilinear formula with a plus gate at the root and of
polynomial size (in $|D|$). Since $M[\cdot]$ is a linear operator, it suffices to show that when
applying $M[\cdot]$ on each summand in $D$, as described in Claim 6, one obtains a (multilinear)
polynomial that has a depth-3 multilinear formula with a plus gate at the root, and of polynomial
size in the number of variables $n$ (note that clearly $n \le |D|$). This is established in the
following claim:

Claim 7. The polynomial $M[q \cdot \prod_{k \in K} z_k^{r_k}]$ has a depth-3 multilinear formula of
polynomial size in $n$ (the overall number of variables) and with a plus gate at the root (over fields
of characteristic 0), under the same notation as in Claim 6.

Proof of claim: Recall that a power of a symmetric polynomial is itself a symmetric polynomial.
Since each $z_k$ (for all $k \in K$) is a symmetric polynomial, its power $z_k^{r_k}$ is also
symmetric. The polynomial $q$ is a translation of a clause, hence it is a product of two symmetric
polynomials: the symmetric polynomial that is the translation of the disjunction of literals with
positive signs, and the symmetric polynomial that is the translation of the disjunction of literals
with negative signs. Therefore, $q \cdot \prod_{k \in K} z_k^{r_k}$ is a product of a constant number
of symmetric polynomials. By Proposition 4, $M[q \cdot \prod_{k \in K} z_k^{r_k}]$ (where here the
$M[\cdot]$ operator operates on the $x$ variables in the $z_k$'s and $q$) is a polynomial for which
there is a polynomial-size (in $n$) depth-3 multilinear formula with a plus gate at the root (over
fields of characteristic 0). ■ ■

We now come to the main corollary of this section. 

Corollary 35. Multilinear proofs operating with depth-3 multilinear formulas (that is, depth-3 fMC
proofs) polynomially simulate R^0(lin) proofs.

Proof: Immediate from Corollary 32, Theorem 33 and Proposition 34.

For the sake of clarity we repeat the chain of transformations needed to prove the simulation.
Given an R^0(lin) proof $\pi$, we first use Corollary 32 to transform it into a PCR proof $\pi'$, with
number of steps that is at most polynomial in $|\pi|$, and where each line in $\pi'$ is a polynomial
translation of some R^0(lin)-line with size at most polynomial in the maximal line in $\pi$ (which is
clearly at most polynomial in $|\pi|$). Thus, by Proposition 34 each polynomial in $\pi'$ has a
corresponding multilinear polynomial with a polynomial-size (in $|\pi|$) depth-3 multilinear formula
(and a plus gate at the root). Therefore, by Theorem 33, we can transform $\pi'$ into a depth-3 fMC
proof with only a polynomial (in $|\pi|$) increase in size. ■

9.4. Small Depth-3 Multilinear Proofs. Since R^0(lin) admits polynomial-size (in $n$) refutations
of the $m$ to $n$ pigeonhole principle (for any $m > n$) (as defined in 6.1), Corollary 35 and Theorem
15 yield:

Theorem 36. For any $m > n$ there are polynomial-size (in $n$) depth-3 fMC refutations of the $m$
to $n$ pigeonhole principle $\mathrm{PHP}^m_n$ (over fields of characteristic 0).

This improves over the result in [RT06], which demonstrated polynomial-size (in $n$) depth-3 fMC
refutations of a weaker principle, namely the $m$ to $n$ functional pigeonhole principle.
Furthermore, Corollary 35 and Theorem 19 yield:

Theorem 37. Let $G$ be an $r$-regular graph with $n$ vertices, where $r$ is a constant, and fix some
modulus $p$. Then there are polynomial-size (in $n$) depth-3 fMC refutations of Tseitin mod $p$
formulas $\neg\mathrm{Tseitin}_{G,p}$ (over fields of characteristic 0).

The polynomial-size refutations of Tseitin graph tautologies here are different from those
demonstrated in [RT06]. Theorem 37 establishes polynomial-size refutations of Tseitin mod $p$
formulas over any field of characteristic 0, whereas [RT06] required the field to contain a primitive
$p$th root of unity. On the other hand, the refutations in [RT06] of Tseitin mod $p$ formulas do not
make any use of the semantic nature of the fMC proof system, in the sense that they do not utilize
the fact that the base field is of characteristic 0 (which in turn enables one to efficiently
represent any symmetric [multilinear] polynomial by a depth-3 multilinear formula).

10. Relations with Extensions of Cutting Planes 

In this section we tie some loose ends by showing that, in full generality, R(lin) polynomially 
simulates R(CP) with polynomially bounded coefficients, denoted R(CP*). First we define the 
R(CP*) proof system - introduced in [Kra98] - which is a common extension of resolution and 
CP* (the latter is cutting planes with polynomially bounded coefficients). The system R(CP*), 
thus, is essentially resolution operating with disjunctions of linear inequalities (with polynomially 
bounded integral coefficients) augmented with the cutting planes inference rules.

A linear inequality is written as

$$a \cdot x \ge a_0 , \qquad (50)$$

where $a$ is a vector of integral coefficients $a_1, \dots, a_n$, $x$ is a vector of variables
$x_1, \dots, x_n$, and $a_0$ is an integer. The size of the linear inequality (50) is the sum of all
$a_0, \dots, a_n$ written in unary notation (this is similar to the size of linear equations in
R(lin)). A disjunction of linear inequalities is just a disjunction of inequalities of the form in
(50). The semantics of a disjunction of inequalities is the natural one, that is, a disjunction is
true under an assignment of integral values to $x$ if and only if at least one of the inequalities is
true under the assignment. The size of a disjunction of linear inequalities is the total size of all
linear inequalities in it. We can also add linear inequalities in the obvious way, that is, if $L_1$
is the linear inequality $a \cdot x \ge a_0$ and $L_2$ is the linear inequality $b \cdot x \ge b_0$,
then $L_1 + L_2$ is the linear inequality $(a + b) \cdot x \ge a_0 + b_0$.
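As an illustration only, the following Python sketch models a linear inequality of the form (50), its unary size, and the addition of two inequalities. The class name `Inequality`, the helper `addition_is_sound`, and the use of absolute values in the size measure are assumptions of this sketch and are not taken from the text.

```python
from dataclasses import dataclass
from itertools import product
from typing import Tuple


@dataclass(frozen=True)
class Inequality:
    """The linear inequality a . x >= a0 with integer coefficients."""
    a: Tuple[int, ...]
    a0: int

    def size(self) -> int:
        # Unary size: assumed here to be the sum of absolute values of a0, ..., an.
        return abs(self.a0) + sum(abs(c) for c in self.a)

    def holds(self, x: Tuple[int, ...]) -> bool:
        # True iff the assignment x satisfies a . x >= a0.
        return sum(c * v for c, v in zip(self.a, x)) >= self.a0

    def __add__(self, other: "Inequality") -> "Inequality":
        # L1 + L2 is (a + b) . x >= a0 + b0.
        summed = tuple(c + d for c, d in zip(self.a, other.a))
        return Inequality(summed, self.a0 + other.a0)


def addition_is_sound(l1: Inequality, l2: Inequality) -> bool:
    """Brute-force check over 0/1 assignments: L1 and L2 together imply L1 + L2."""
    n = len(l1.a)
    return all(
        (l1 + l2).holds(x)
        for x in product((0, 1), repeat=n)
        if l1.holds(x) and l2.holds(x)
    )
```

For example, adding $x_1 + 2x_2 \ge 1$ and $-x_1 + x_2 \ge 0$ in this model gives $3x_2 \ge 1$.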

The proof system R(CP*) operates with disjunctions of linear inequalities with integral
coefficients (written in unary representation), and is defined as follows (our formulation is
similar to that in [Koj07]):$^{21}$

Definition 10.1 (R(CP*)). Let $K := \{K_1, \dots, K_m\}$ be a collection of disjunctions of linear
inequalities (whose coefficients are written in unary representation). An R(CP*)-proof from $K$ of a
disjunction of linear inequalities $D$ is a finite sequence $\pi = (D_1, \dots, D_\ell)$ of
disjunctions of linear inequalities, such that $D_\ell = D$ and for each $i \in [\ell]$: either
$D_i = K_j$ for some $j \in [m]$; or $D_i$ is one of the following R(CP*)-axioms:

(1) $x_i \ge 0$, for any variable $x_i$;

(2) $-x_i \ge -1$, for any variable $x_i$;

(3) $(a \cdot x \ge a_0) \lor (-a \cdot x \ge 1 - a_0)$, where all coefficients (including $a_0$) are integers;

or $D_i$ was deduced from previous lines by one of the following R(CP*)-inference rules:

(1) Let $A, B$ be two disjunctions of linear inequalities and let $L_1, L_2$ be two linear inequalities.$^{22}$
From $A \lor L_1$ and $B \lor L_2$ derive $A \lor B \lor (L_1 + L_2)$.

(2) Let $L$ be some linear inequality.
From a disjunction of linear inequalities $A$ derive $A \lor L$.

(3) Let $A$ be a disjunction of linear inequalities.
From $A \lor (0 \ge 1)$ derive $A$.

(4) Let $c$ be a non-negative integer.
From $(a \cdot x \ge a_0) \lor A$ derive $(ca \cdot x \ge ca_0) \lor A$.

(5) Let $A$ be a disjunction of linear inequalities, and let $c \ge 1$ be an integer.
From $(ca \cdot x \ge a_0) \lor A$ derive $(a \cdot x \ge \lceil a_0/c \rceil) \lor A$.

An R(CP*) refutation of a collection of disjunctions of linear inequalities $K$ is a proof of the
empty disjunction from $K$. The size of a proof $\pi$ in R(CP*) is the total size of all the
disjunctions of linear inequalities in $\pi$, denoted $|\pi|$.

21 When we allow coefficients to be written in binary representation, instead of unary
representation, the resulting proof system is denoted R(CP).

22 In all R(CP*)-inference rules, $A, B$ are possibly the empty disjunctions.

In order for R(lin) to simulate R(CP*) proofs, we need to fix the following translation scheme.
Every linear inequality $L$ of the form $a \cdot x \ge a_0$ is translated into the following
disjunction, denoted $\widehat{L}$:

$$(a \cdot x = a_0) \lor (a \cdot x = a_0 + 1) \lor \dots \lor (a \cdot x = a_0 + k) , \qquad (51)$$

where $k$ is such that $a_0 + k$ equals the sum of all positive coefficients in $a$, that is,
$a_0 + k = \max_{x \in \{0,1\}^n} (a \cdot x)$ (in case the sum of all positive coefficients in $a$ is
less than $a_0$, we put $k = 0$). An inequality with no variables of the form $0 \ge a_0$ is
translated into $0 = a_0$ in case it is false (that is, in case $0 < a_0$), and into $0 = 0$ in case it
is true (that is, in case $0 \ge a_0$). Note that since the coefficients of linear inequalities (and
linear equations) are written in unary representation, any linear inequality of size $s$ translates
into a disjunction of linear equations of size $O(s^2)$. Clearly, every 0, 1 assignment to the
variables $x$ satisfies $L$ if and only if it satisfies its translation $\widehat{L}$. A disjunction of
linear inequalities $D$ is translated into the disjunction of the translations of all the linear
inequalities in it, denoted $\widehat{D}$. A collection $K := \{K_1, \dots, K_m\}$ of disjunctions of
linear inequalities is translated into the collection $\{\widehat{K}_1, \dots, \widehat{K}_m\}$.
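The translation scheme can be checked mechanically on small instances. The following Python sketch is only an illustration under the conventions stated above: the function `translate` returns the right-hand sides $b$ of the equations $a \cdot x = b$ appearing in the translation of $a \cdot x \ge a_0$, and `agrees` verifies by brute force that a 0, 1 assignment satisfies the inequality exactly when it satisfies the translation. Both function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence


def translate(a: Sequence[int], a0: int) -> List[int]:
    """Right-hand sides b of the equations a . x = b in the translation of a . x >= a0."""
    if not a:
        # No variables: 0 >= a0 becomes 0 = 0 if true, and 0 = a0 if false.
        return [0] if 0 >= a0 else [a0]
    top = sum(c for c in a if c > 0)  # maximum of a . x over 0/1 assignments
    k = max(top - a0, 0)              # k = 0 when the maximum is below a0
    return list(range(a0, a0 + k + 1))


def agrees(a: Sequence[int], a0: int) -> bool:
    """Check that a . x >= a0 and its translation agree on every 0/1 assignment."""
    rhs = set(translate(a, a0))
    for x in product((0, 1), repeat=len(a)):
        value = sum(c * v for c, v in zip(a, x))
        if (value >= a0) != (value in rhs):
            return False
    return True
```

For instance, $2x_1 - x_2 + 3x_3 \ge 1$ is translated into the five equations with right-hand sides $1, \dots, 5$, since the sum of the positive coefficients is 5.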

Theorem 38. R(lin) polynomially simulates R(CP*). In other words, if $\pi$ is an R(CP*) proof of
a linear inequality $D$ from a collection of disjunctions of linear inequalities $K_1, \dots, K_t$,
then there is an R(lin) proof of $\widehat{D}$ from $\widehat{K}_1, \dots, \widehat{K}_t$ whose size
is polynomial in $|\pi|$.

Proof: By induction on the number of proof-lines in $\pi$.

Base case: Here we only need to show that the axioms of R(CP*) translate into axioms of
R(lin), or can be derived with polynomial-size (in the size of the original R(CP*) axiom) R(lin)
derivations (from R(lin)'s axioms).

R(CP*) axiom number (1): $x_i \ge 0$ translates into the R(lin) axiom $(x_i = 0) \lor (x_i = 1)$.

R(CP*) axiom number (2): $-x_i \ge -1$ translates into $(-x_i = -1) \lor (-x_i = 0)$. From the
Boolean axiom $(x_i = 1) \lor (x_i = 0)$ of R(lin), one can derive with a constant-size R(lin) proof
the line $(-x_i = -1) \lor (-x_i = 0)$ (for instance, by subtracting each equation in
$(x_i = 1) \lor (x_i = 0)$ twice from itself).

R(CP*) axiom number (3): $(a \cdot x \ge a_0) \lor (-a \cdot x \ge 1 - a_0)$. The inequality
$(a \cdot x \ge a_0)$ translates into

$$\bigvee_{b = a_0}^{h} (a \cdot x = b) ,$$

where $h$ is the maximal value of $a \cdot x$ over 0, 1 assignments to $x$ (that is, $h$ is just the
sum of all positive coefficients in $a$). The inequality $(-a \cdot x \ge 1 - a_0)$ translates into

$$\bigvee_{b = 1 - a_0}^{f} (-a \cdot x = b) ,$$

where $f$ is the maximal value of $-a \cdot x$ over 0, 1 assignments to $x$ (that is, $f$ is just the
sum of the absolute values of all negative coefficients in $a$). Note that one can always flip the
sign of any equation $a \cdot x = b$ in R(lin). This is done, for instance, by subtracting
$a \cdot x = b$ twice from itself. So overall R(CP*) axiom number (3) translates into

$$\bigvee_{b = a_0}^{h} (a \cdot x = b) \lor \bigvee_{b = 1 - a_0}^{f} (-a \cdot x = b) ,$$

that can be converted inside R(lin) into

$$\bigvee_{b = -f}^{a_0 - 1} (a \cdot x = b) \lor \bigvee_{b = a_0}^{h} (a \cdot x = b) . \qquad (52)$$

Let $\mathcal{A}' := \{-f, -f + 1, \dots, a_0 - 1, a_0, a_0 + 1, \dots, h\}$ and let $\mathcal{A}$ be
the set of all possible values that $a \cdot x$ can get over all possible Boolean assignments to $x$.
Notice that $\mathcal{A} \subseteq \mathcal{A}'$. By Lemma 8, for any $a \cdot x$, there is a
polynomial-size (in the size of the linear form $a \cdot x$) derivation of
$\bigvee_{\alpha \in \mathcal{A}} (a \cdot x = \alpha)$. By using the R(lin) Weakening rule we can
then derive $\bigvee_{\alpha \in \mathcal{A}'} (a \cdot x = \alpha)$, which is equal to (52).
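The containment of the value set of $a \cdot x$ in the interval from $-f$ to $h$, which is what makes the Weakening step above applicable, can be confirmed by brute force on small linear forms. The following Python sketch is an illustration only; the function names `values`, `interval` and `covered` are hypothetical.

```python
from itertools import product
from typing import Sequence, Set


def values(a: Sequence[int]) -> Set[int]:
    """All values that a . x takes over 0/1 assignments to x."""
    return {
        sum(c * v for c, v in zip(a, x))
        for x in product((0, 1), repeat=len(a))
    }


def interval(a: Sequence[int]) -> Set[int]:
    """The integer interval {-f, ..., h} from the treatment of axiom (3)."""
    h = sum(c for c in a if c > 0)   # maximum of a . x
    f = -sum(c for c in a if c < 0)  # maximum of -a . x
    return set(range(-f, h + 1))


def covered(a: Sequence[int]) -> bool:
    """Check that every attainable value of a . x lies in {-f, ..., h}."""
    return values(a) <= interval(a)
```

The containment can be proper: for $a = (2, 4)$ the attainable values are $0, 2, 4, 6$, while the interval is $\{0, \dots, 6\}$, which is why Weakening may be needed.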

Induction step: Here we simply need to show how to polynomially simulate inside R(lin) every
inference rule application of R(CP*).

Rule (1): Let $A, B$ be two disjunctions of linear inequalities and let $L_1, L_2$ be two linear
inequalities. Assume we already have R(lin) proofs of $\widehat{A} \lor \widehat{L_1}$ and
$\widehat{B} \lor \widehat{L_2}$. We need to derive
$\widehat{A} \lor \widehat{B} \lor \widehat{L_1 + L_2}$. Corollary 7 shows that there is a
polynomial-size (in the size of $\widehat{L_1}$ and $\widehat{L_2}$, which is polynomial in the size
of $L_1$ and $L_2$) derivation of $\widehat{L_1 + L_2}$ from $\widehat{L_1}$ and $\widehat{L_2}$,
from which the desired derivation immediately follows.

Rule (2): The simulation of this rule in R(lin) is done using the R(lin) Weakening rule.

Rule (3): The simulation of this rule in R(lin) is done using the R(lin) Simplification rule
(remember that $0 \ge 1$ translates into $0 = 1$ under our translation scheme).

Rule (4): Let $c$ be a non-negative integer. We need to derive the translation of
$(ca \cdot x \ge ca_0) \lor A$ from the translation of $(a \cdot x \ge a_0) \lor A$ in R(lin). This
amounts only to "adding together" $c$ times the translation of $(a \cdot x \ge a_0)$ in the
translation of $(a \cdot x \ge a_0) \lor A$. This can be achieved by $c$ many applications of
Corollary 7. We omit the details.
Rule (5): We need to derive the translation of $(a \cdot x \ge \lceil a_0/c \rceil) \lor A$ from the
translation of $(ca \cdot x \ge a_0) \lor A$. Consider the disjunction of linear equations that is the
translation of $(ca \cdot x \ge a_0)$, which can be written as:

$$(ca \cdot x = a_0) \lor (ca \cdot x = a_0 + 1) \lor \dots \lor (ca \cdot x = a_0 + r) , \qquad (53)$$

where $a_0 + r$ is the maximal value $ca \cdot x$ can get over 0, 1 assignments to $x$. By Lemma 8
there is a polynomial-size (in the size of $a \cdot x$) R(lin) proof of

$$\bigvee_{\alpha \in \mathcal{A}} (a \cdot x = \alpha) , \qquad (54)$$

where $\mathcal{A}$ is the set of all possible values of $a \cdot x$ over 0, 1 assignments to $x$.

We now use (53) to cut off from (54) all equations $(a \cdot x = \beta)$ for all
$\beta < \lceil a_0/c \rceil$ (this will give us the desired disjunction of linear equations).
Consider the equation $(a \cdot x = \beta)$ in (54) for some fixed $\beta < \lceil a_0/c \rceil$. Use
the resolution rule of R(lin) to add this equation to itself $c$ times inside (54). We thus obtain

$$(ca \cdot x = c\beta) \lor \bigvee_{\alpha \in \mathcal{A} \setminus \{\beta\}} (a \cdot x = \alpha) . \qquad (55)$$

Since $\beta$ is an integer and $\beta < \lceil a_0/c \rceil$, we have $c\beta < a_0$. Thus, the
equation $(ca \cdot x = c\beta)$ does not appear in (53). We can then successively resolve
$(ca \cdot x = c\beta)$ in (55) with each equation
$(ca \cdot x = a_0), \dots, (ca \cdot x = a_0 + r)$ in (53). Hence, we arrive at
$\bigvee_{\alpha \in \mathcal{A} \setminus \{\beta\}} (a \cdot x = \alpha)$. Overall, we can cut off
all equations $(a \cdot x = \beta)$, for $\beta < \lceil a_0/c \rceil$, from (54). We then get the
disjunction

$$\bigvee_{\alpha \in \mathcal{A}'} (a \cdot x = \alpha) ,$$

where $\mathcal{A}'$ is the set of all elements of $\mathcal{A}$ greater than or equal to
$\lceil a_0/c \rceil$ (in other words, all values greater than or equal to $\lceil a_0/c \rceil$ that
$a \cdot x$ can get over 0, 1 assignments to $x$). Using the Weakening rule of R(lin) (if necessary)
we can arrive finally at the desired disjunction, namely the translation of
$(a \cdot x \ge \lceil a_0/c \rceil)$, which concludes the R(lin) simulation of R(CP*)'s inference
Rule (5). ■
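The arithmetic behind the simulation of Rule (5) can be sanity-checked on small instances. The Python sketch below is an illustration only and does not construct R(lin) derivations: it checks that every cut-off value $\beta < \lceil a_0/c \rceil$ satisfies $c\beta < a_0$, and that the surviving values are exactly those attained by assignments satisfying $ca \cdot x \ge a_0$. All function names are hypothetical.

```python
from itertools import product
from typing import Sequence, Set


def ceil_div(a0: int, c: int) -> int:
    """Integer ceiling of a0 / c for c >= 1."""
    return -(-a0 // c)


def values(a: Sequence[int]) -> Set[int]:
    """All values that a . x takes over 0/1 assignments to x."""
    return {
        sum(ci * v for ci, v in zip(a, x))
        for x in product((0, 1), repeat=len(a))
    }


def surviving_values(a: Sequence[int], a0: int, c: int) -> Set[int]:
    """Values alpha of a . x that remain after cutting off all beta < ceil(a0/c)."""
    bound = ceil_div(a0, c)
    return {v for v in values(a) if v >= bound}


def cut_is_justified(a: Sequence[int], a0: int, c: int) -> bool:
    """Every cut-off value beta satisfies c * beta < a0, so it is absent from (53)."""
    bound = ceil_div(a0, c)
    return all(c * beta < a0 for beta in values(a) if beta < bound)


def rounding_agrees(a: Sequence[int], a0: int, c: int) -> bool:
    """The surviving values are exactly those with c * (a . x) >= a0."""
    return surviving_values(a, a0, c) == {v for v in values(a) if c * v >= a0}
```

For example, with $a = (1, 1, 1)$, $a_0 = 5$ and $c = 3$, the bound is $\lceil 5/3 \rceil = 2$ and the surviving values are $2$ and $3$.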



Appendix A. Feasible Monotone Interpolation 

Here we formally define the feasible monotone interpolation property. The definition is taken 
mainly from [Kra97]. 

Recall that for two binary strings of length $n$ (or equivalently, Boolean assignments for $n$
propositional variables) $\alpha, \alpha'$, we denote by $\alpha' \ge \alpha$ that $\alpha'$ is
bitwise greater than $\alpha$, that is, that for all $i \in [n]$, $\alpha'_i \ge \alpha_i$ (where
$\alpha'_i$ and $\alpha_i$ are the $i$th bits of $\alpha'$ and $\alpha$, respectively). Let
$A(p, q), B(p, r)$ be two collections of formulas in the displayed variables only, where $p, q, r$ are
pairwise disjoint sequences of distinct variables (similar to the notation at the beginning of
Section 7). Assume that there is no assignment that satisfies both $A(p, q)$ and $B(p, r)$. We say
that $A(p, q), B(p, r)$ are monotone if one of the following conditions holds:

(1) If $\alpha$ is an assignment to $p$ and $\beta$ is an assignment to $q$ such that
$A(\alpha, \beta) = 1$, then for any assignment $\alpha' \ge \alpha$ it holds that
$A(\alpha', \beta) = 1$.

(2) If $\alpha$ is an assignment to $p$ and $\beta$ is an assignment to $r$ such that
$B(\alpha, \beta) = 1$, then for any assignment $\alpha' \le \alpha$ it holds that
$B(\alpha', \beta) = 1$.
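Condition (1) can be tested by exhaustive search when the number of variables is small. The Python sketch below is only an illustration: it assumes that $A$ is supplied as a Python predicate on a pair of 0/1 tuples, and the names `bitwise_geq` and `meets_condition_1` are hypothetical.

```python
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def bitwise_geq(x: Bits, y: Bits) -> bool:
    """True iff x is bitwise greater than or equal to y."""
    return all(u >= v for u, v in zip(x, y))


def meets_condition_1(A: Callable[[Bits, Bits], bool], n: int, s: int) -> bool:
    """Brute-force check of condition (1) for p of length n and q of length s."""
    cube = list(product((0, 1), repeat=n))
    for beta in product((0, 1), repeat=s):
        for alpha in cube:
            if not A(alpha, beta):
                continue
            # Every alpha2 >= alpha must also satisfy A together with beta.
            for alpha2 in cube:
                if bitwise_geq(alpha2, alpha) and not A(alpha2, beta):
                    return False
    return True
```

Condition (2) can be checked symmetrically by reversing the order comparison. The search takes time exponential in $n + s$, so it is meant only for toy examples.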

Fix a certain proof system $\mathcal{P}$. Recall the definition of the interpolant function
(corresponding to a given unsatisfiable $A(p, q) \land B(p, r)$; that is, functions for which (39) in
Section 7 holds). Assume that for every monotone $A(p, q), B(p, r)$ there is a transformation from
every $\mathcal{P}$-refutation of $A(p, q) \land B(p, r)$ into the corresponding interpolant monotone
Boolean circuit $C(p)$ (that is, $C(p)$ uses only monotone gates$^{23}$) whose size is polynomial in
the size of the refutation (note that for every monotone $A(p, q), B(p, r)$ the corresponding
interpolant circuit must compute a monotone function;$^{24}$ the interpolant circuit itself, however,
might not be monotone, namely, it may use non-monotone gates). In such a case, we say that
$\mathcal{P}$ has the feasible monotone interpolation property. This means that, if a proof system
$\mathcal{P}$ has the feasible monotone interpolation property, then an exponential lower bound on
monotone circuits that compute the interpolant function corresponding to $A(p, q) \land B(p, r)$
implies an exponential-size lower bound on $\mathcal{P}$-refutations of $A(p, q) \land B(p, r)$.

Definition A.1 (Feasible monotone interpolation property). Let $\mathcal{P}$ be a propositional
refutation system. Let $A_1(p, q), \dots, A_k(p, q)$ and $B_1(p, r), \dots, B_\ell(p, r)$ be two
collections of formulas with the displayed variables only (where $p$ has $n$ variables, $q$ has $s$
variables and $r$ has $t$ variables), such that either (the set of satisfying assignments of)
$A_1(p, q), \dots, A_k(p, q)$ meet condition 1 above or (the set of satisfying assignments of)
$B_1(p, r), \dots, B_\ell(p, r)$ meet condition 2 above. Assume that for any such
$A_1(p, q), \dots, A_k(p, q)$ and $B_1(p, r), \dots, B_\ell(p, r)$, if there exists a
$\mathcal{P}$-refutation for
$A_1(p, q) \land \dots \land A_k(p, q) \land B_1(p, r) \land \dots \land B_\ell(p, r)$ of size $S$
then there exists a monotone Boolean circuit separating $U_A$ from $V_B$ (as defined in Section 7.1)
of size polynomial in $S$. In this case we say that $\mathcal{P}$ possesses the feasible monotone
interpolation property.